Table of Contents
- Introducing the HashSet in C#
- 1. Remove Duplicates From a List
- 2. Fast Lookups in Large Lists
- 3. Find Shared Tags Between Two Collections
- 4. Merge Two Collections (Union)
- 5. Find What’s Missing (Difference)
- 6. Compare Two Lists for Exclusives (Symmetric Difference)
- 7. De-duplicate Custom Objects by Property
- HashSet vs List Comparison
- Thread Safety and Memory Considerations
- Wrapping Up
Introducing the HashSet in C#
If you’re working with large datasets or need lightning-fast lookups, HashSet in C# might just be your secret weapon. Unlike traditional collections, HashSet offers some unique advantages in terms of performance, especially when it comes to checking for the existence of elements. In this article, we’ll look at why HashSet in C# is a game-changer — with real-world use cases, performance comparisons, and code examples that show exactly when and how to use it.
It’s really easy to reach for a standard List object when you’re building out your apps in .NET. It does pretty much everything you need (most of the time): it implements IEnumerable, works well with LINQ, has sorting built in, and lets you pull values out by index really easily. In most cases, it’s the right tool for the job because it just works — especially when you need to store and manipulate a collection of items, like imported datasets of one kind or another. But sometimes, there’s a lighter choice that can work even better with certain types of data.
If you’ve ever needed to make sure a collection contains only unique values, or wanted a faster way to check whether something already exists, HashSet in C# might be exactly what you’re looking for. It’s a no-duplicates, no-fuss collection that includes built-in set operations like intersection, union, and difference. Let’s take a look at where it comes in handy, and a few examples that show how powerful it can be.
1. Remove Duplicates From a List
Starting with a regular list of strings, let’s use a HashSet<string> to dedupe emails:
var emails = new List<string> {
    "alex@jkrussell.dev", "sam@jkrussell.dev", "alex@jkrussell.dev", "chris@jkrussell.dev"
};
var uniqueEmails = new HashSet<string>(emails);
foreach (var email in uniqueEmails)
{
    Console.WriteLine(email);
}
Output:
alex@jkrussell.dev
sam@jkrussell.dev
chris@jkrussell.dev
A real-world scenario where you might use this is when you’ve accepted bulk user input and you want to dedupe email addresses or other data quickly.
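If you also want to know which values were repeated, or need to keep the first-seen order explicitly, HashSet<T>.Add() returns false when the value is already in the set. Here is a minimal sketch that reuses the email list above; the seen, ordered and duplicates names are only for illustration:

```csharp
using System;
using System.Collections.Generic;

var emails = new List<string> {
    "alex@jkrussell.dev", "sam@jkrussell.dev", "alex@jkrussell.dev", "chris@jkrussell.dev"
};

var seen = new HashSet<string>();
var ordered = new List<string>();     // unique emails, in first-seen order
var duplicates = new List<string>();  // every repeat that the set rejected

foreach (var email in emails)
{
    // Add returns true only when the value was not already in the set
    if (seen.Add(email))
        ordered.Add(email);
    else
        duplicates.Add(email);
}

Console.WriteLine(string.Join(", ", ordered));
// alex@jkrussell.dev, sam@jkrussell.dev, chris@jkrussell.dev
Console.WriteLine($"Duplicates found: {duplicates.Count}");
// Duplicates found: 1
```

This does the de-duplication in a single pass and lets you report the rejected entries back to the user.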
2. Fast Lookups in Large Lists
Lookup speed is one of the biggest advantages of using a HashSet in C#. Here’s the best way to check if your list contains a given value:
var bannedEmails = new HashSet<string> {
    "admin@jkrussell.dev", "test@jkrussell.dev", "null@jkrussell.dev"
};
if (bannedEmails.Contains("admin@jkrussell.dev"))
{
    Console.WriteLine("Email is blocked.");
}
Output:
Email is blocked.
On large collections, HashSet lookups can be 10–100x faster than List lookups (or more, depending on size), though you won’t really feel it in small examples like this. As your collections grow, the benefits become real and tangible.
Let’s look at a comparison of how some of the common collection types cope at scale:
| Collection Type | .Contains() Time Complexity | Notes |
| List<T> | O(n), linear | Checks each item one by one, so it gets slower as the list gets longer |
| Array | O(n), linear | Same as List: no shortcuts for lookups |
| Dictionary<TKey,TValue> | O(1) on average | Finds keys by hash (via ContainsKey), so it stays fast even with large data |
| HashSet<T> | O(1) on average | Like Dictionary but for single values, so it stays fast as the set grows |
The Big-O notation might seem a bit complicated to understand, but it’s a really useful way of describing how fast (or slow) an algorithm is as the input grows — it’s a shorthand for performance scaling. Think of it as measuring how many steps something takes as the list or data gets bigger.
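Here’s a chart that illustrates how lookup performance differs across common C# collections when using the .Contains() method:
[Chart: .Contains() lookup time against number of elements for List<T>, Array, Dictionary<K,V> and HashSet<T>]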
In the chart, the dashed lines represent constant-time performance as the number of elements grows. The green (HashSet<T>) and blue (Dictionary<K,V>) dashed lines illustrate their speed. The red and orange lines (List<T> and Array) climb steadily, showing that these data structures slow down as their size increases.
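To see the gap on your own machine, a rough timing sketch like the one below can help. It is not a rigorous benchmark: the collection size and lookup count are arbitrary assumptions, and the numbers will vary with hardware and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

const int size = 100_000;   // assumed collection size
const int lookups = 1_000;  // assumed number of lookups

var list = Enumerable.Range(0, size).ToList();
var set = new HashSet<int>(list);

// Search for values near the end of the list: the worst case for a linear scan
var listHits = 0;
var sw = Stopwatch.StartNew();
for (var i = 0; i < lookups; i++)
{
    if (list.Contains(size - 1 - i)) listHits++;
}
sw.Stop();
var listMs = sw.Elapsed.TotalMilliseconds;

// Same lookups against the set, which hashes straight to the right bucket
var setHits = 0;
sw.Restart();
for (var i = 0; i < lookups; i++)
{
    if (set.Contains(size - 1 - i)) setHits++;
}
sw.Stop();
var setMs = sw.Elapsed.TotalMilliseconds;

Console.WriteLine($"List<T>.Contains:    {listMs:F2} ms ({listHits} hits)");
Console.WriteLine($"HashSet<T>.Contains: {setMs:F2} ms ({setHits} hits)");
```

For measurements you intend to rely on, a dedicated benchmarking tool will give more trustworthy results than a Stopwatch loop.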
3. Find Shared Tags Between Two Collections
Compare two lists for common items:
var postTags = new HashSet<string> { "csharp", "aspnet", "linq", "backend" };
var userTags = new List<string> { "linq", "frontend", "Backend", "csharp", "api" };
postTags.IntersectWith(userTags);
foreach (var tag in postTags)
{
    Console.WriteLine($"Matched tag: {tag}");
}
Output:
Matched tag: csharp
Matched tag: linq
In the example, we used IntersectWith() to modify the postTags HashSet in place. If you want a non-destructive alternative, use LINQ’s Intersect(), which returns a new sequence and leaves postTags untouched:
var sharedTags = postTags.Intersect(userTags);
In both cases, it’s important to note that a HashSet<string> is case-sensitive by default. In the example above, one collection contains “backend” and the other contains “Backend”, so they are not considered a match and do not appear in the output. If you want to ignore casing differences, declare your HashSet like this:
var postTags = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
Note that LINQ’s Intersect() uses the default comparer rather than the set’s own, so pass StringComparer.OrdinalIgnoreCase to it as an extra argument if you want the same behaviour there.
4. Merge Two Collections (Union)
You’re probably used to using LINQ queries and calling .Distinct() to filter out duplicates in a list. But what about when you want to bring two datasets together? Using a HashSet in C# is one of the most efficient ways to merge two collections without duplicates, which is especially useful when email addresses or other identifiers must remain unique. Unlike List, HashSet enforces uniqueness automatically.
Here’s an example of how it can be used in a scenario bringing two sets of emails together:
var subscribedEmails = new HashSet<string> { "alex@jkrussell.dev", "sam@jkrussell.dev" };
var newEmails = new List<string> { "sam@jkrussell.dev", "bob@jkrussell.dev" };
subscribedEmails.UnionWith(newEmails);
foreach (var email in subscribedEmails)
{
    Console.WriteLine(email);
}
Output:
alex@jkrussell.dev
sam@jkrussell.dev
bob@jkrussell.dev
A HashSet doesn’t guarantee any particular order, so if that’s something you need you’ll either have to sort afterwards or use a different data structure.
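Both routes to ordered output can be sketched briefly. The example below assumes ordinal string ordering is what you want; the sortedSnapshot and alwaysSorted names are only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var subscribedEmails = new HashSet<string> { "sam@jkrussell.dev", "alex@jkrussell.dev" };
var newEmails = new List<string> { "sam@jkrussell.dev", "bob@jkrussell.dev" };
subscribedEmails.UnionWith(newEmails);

// Option 1: sort a snapshot of the set whenever ordered output is needed
var sortedSnapshot = subscribedEmails.OrderBy(e => e, StringComparer.Ordinal).ToList();

// Option 2: SortedSet<T> is also duplicate-free and keeps its items sorted as they are added
var alwaysSorted = new SortedSet<string>(subscribedEmails, StringComparer.Ordinal);
alwaysSorted.Add("chris@jkrussell.dev");

Console.WriteLine(string.Join(", ", sortedSnapshot));
// alex@jkrussell.dev, bob@jkrussell.dev, sam@jkrussell.dev
Console.WriteLine(string.Join(", ", alwaysSorted));
// alex@jkrussell.dev, bob@jkrussell.dev, chris@jkrussell.dev, sam@jkrussell.dev
```

The trade-off is that SortedSet<T> lookups and inserts are O(log n) rather than O(1) on average, so it is only worth it when you need the ordering all the time.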
5. Find What’s Missing (Difference)
Get items in list A not in list B:
var allFeatures = new HashSet<string> { "Login", "Signup", "DarkMode", "Export" };
var completedFeatures = new List<string> { "Login", "Signup" };
allFeatures.ExceptWith(completedFeatures);
foreach (var feature in allFeatures)
{
    Console.WriteLine($"Still to do: {feature}");
}
Output:
Still to do: DarkMode
Still to do: Export
Remember earlier how we used IntersectWith() to find shared items, and Intersect() as a non-destructive alternative? The same principle applies here: ExceptWith() modifies the original set, while LINQ’s Except() returns a new sequence without altering the source:
// Assumes emails is a HashSet<string> and unsubscribed is any IEnumerable<string>
// Destructive: removes the unsubscribed addresses from emails itself
emails.ExceptWith(unsubscribed);
// Non-destructive: emails is left unchanged
var filtered = emails.Except(unsubscribed);
6. Compare Two Lists for Exclusives (Symmetric Difference)
Compare two lists and return differences:
var listA = new HashSet<string> { "feature1", "feature2", "feature3" };
var listB = new HashSet<string> { "feature2", "feature3", "feature4" };
listA.SymmetricExceptWith(listB);
foreach (var item in listA)
{
    Console.WriteLine($"Unique to one list: {item}");
}
Output:
Unique to one list: feature1
Unique to one list: feature4
The same caution applies here as in the previous examples: SymmetricExceptWith() is destructive and modifies listA. Unlike the other operations, though, LINQ has no SymmetricExcept() counterpart, so if you need to keep the original intact, run SymmetricExceptWith() on a copy of the set.
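Here is a minimal sketch of a non-destructive symmetric difference, which copies the first set before calling SymmetricExceptWith(). It reuses listA and listB from the example above, and the exclusives name is only for illustration:

```csharp
using System;
using System.Collections.Generic;

var listA = new HashSet<string> { "feature1", "feature2", "feature3" };
var listB = new HashSet<string> { "feature2", "feature3", "feature4" };

// Copy first, so that listA itself is never modified
var exclusives = new HashSet<string>(listA);
exclusives.SymmetricExceptWith(listB);

Console.WriteLine($"Exclusive items: {exclusives.Count}");  // Exclusive items: 2
Console.WriteLine($"listA still has: {listA.Count}");       // listA still has: 3
```

The copy costs extra memory and an O(n) pass, which is usually a fair price for leaving both inputs untouched.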
7. De-duplicate Custom Objects by Property
The final example is really interesting, and highlights just how a HashSet<T> determines whether two objects are the same. Check out this code, which you’ll notice has Equals() and GetHashCode() methods:
var products = new HashSet<Product>
{
    new Product { SKU = "123" },
    new Product { SKU = "123" },
    new Product { SKU = "456" }
};
foreach (var p in products)
{
    Console.WriteLine($"Product: {p.SKU}");
}
class Product : IEquatable<Product>
{
    // Default to an empty string so GetHashCode() never runs against null
    public string SKU { get; set; } = "";
    // Products with the same SKU must produce the same hash code
    public override int GetHashCode() => SKU.GetHashCode();
    // Strongly typed comparison required by IEquatable<Product>
    public bool Equals(Product? other) => other is not null && SKU == other.SKU;
    // Object-level comparison, forwarded to the typed version
    public override bool Equals(object? obj)
    {
        return Equals(obj as Product);
    }
}
Output:
Product: 123
Product: 456
You’ll notice from the code that Product overrides two members inherited from object: Equals(object) and GetHashCode(). It also implements IEquatable<Product>, which adds the strongly typed Equals(Product). The overridden Equals(object) casts its argument to Product and hands it to the typed version, so both paths share one comparison. HashSet<T> uses these methods to determine if two objects are the same:
- GetHashCode(): to quickly group or locate candidates (bucket).
- Equals(): to confirm actual equality within that group.
If you’re using a custom class like Product, and you don’t override these two methods, each new Product { SKU = "123" } will be treated as a different object, even if the SKU string is identical — because by default, object equality compares reference, not content.
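If you can’t change the class itself (for example, it comes from a library), another option is to pass an IEqualityComparer<T> to the HashSet<T> constructor. This is a sketch under that assumption; CatalogItem and SkuComparer are hypothetical names invented for the example:

```csharp
using System;
using System.Collections.Generic;

// The set uses the supplied comparer instead of CatalogItem's own (reference) equality
var items = new HashSet<CatalogItem>(new SkuComparer())
{
    new CatalogItem { SKU = "123" },
    new CatalogItem { SKU = "123" },
    new CatalogItem { SKU = "456" }
};

Console.WriteLine(items.Count);  // 2

// Hypothetical class with no equality overrides of its own
class CatalogItem
{
    public string SKU { get; set; } = "";
}

// Hypothetical comparer that treats two items as equal when their SKUs match
class SkuComparer : IEqualityComparer<CatalogItem>
{
    public bool Equals(CatalogItem? x, CatalogItem? y) =>
        ReferenceEquals(x, y) || (x is not null && y is not null && x.SKU == y.SKU);

    public int GetHashCode(CatalogItem obj) => obj.SKU.GetHashCode();
}
```

A comparer also lets you de-duplicate the same type by different properties in different places, without committing the class to one definition of equality.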
HashSet vs List Comparison
If you’re considering doing a find and replace on every single List<T> you’ve ever created with a HashSet<T>, you might want to weigh up which one is better suited to your use case first:
| Feature | List<T> | HashSet<T> |
| Duplicates allowed | Yes | No (enforced by default) |
| Ordering | Maintains insertion order | No guaranteed order |
| Lookup performance | O(n) — linear | O(1) — constant |
| Add performance | O(1) — amortised | O(1) — amortised |
| Remove performance | O(n) | O(1) — on average |
| Memory usage | Lower (compact layout) | Higher (uses hash buckets) |
| Set operations | Manual (via LINQ) | Built-in (e.g. IntersectWith) |
| Best for | Ordered lists, duplicates, small datasets | Fast lookups, uniqueness, large datasets |
In data structures, “amortised” refers to the average performance of an operation over time, even if some individual operations might occasionally take a bit longer.
Thread Safety and Memory Considerations
One word of caution: HashSet is not thread-safe. If your app is multi-threaded or you’re doing anything involving parallel tasks that read and write to the same collection, you’ll need to add your own locking or use a concurrent structure. You’re probably not, but it’s worth noting. It’s safe to call Contains() from multiple threads on a set that nothing is modifying, but once you start adding or removing items, you’ll need to wrap access in a lock or move to something like ConcurrentDictionary.
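Both approaches can be sketched in a few lines. The workload here (10,000 parallel adds of 1,000 distinct values) is an arbitrary assumption, and the byte value in the dictionary is just a throwaway placeholder, since only the keys matter:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Option 1: guard a plain HashSet with a lock around every access
var guarded = new HashSet<int>();
var gate = new object();

Parallel.For(0, 10_000, i =>
{
    lock (gate)
    {
        guarded.Add(i % 1_000);  // only 1,000 distinct values
    }
});

// Option 2: use the keys of a ConcurrentDictionary as a set
var concurrent = new ConcurrentDictionary<int, byte>();

Parallel.For(0, 10_000, i =>
{
    concurrent.TryAdd(i % 1_000, 0);  // returns false if the key is already present
});

Console.WriteLine(guarded.Count);     // 1000
Console.WriteLine(concurrent.Count);  // 1000
```

The lock is simpler and keeps the built-in set operations available. The dictionary tends to cope better when many threads write at once, but you lose methods such as IntersectWith().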
Another thing to keep in mind is memory. HashSet is super-fast — but you pay a little for that speed in terms of memory usage. Internally, it uses buckets and hash codes, which makes lookups near-instant. But that structure has more overhead than something like a simple list, which just stores values linearly in memory. It’s not usually a problem unless you’re working with huge datasets or on memory-constrained systems, but it’s worth being aware of just in case.
Wrapping Up
HashSets in .NET are cool. If you don’t use them already, you should try them out — they’re fast, lightweight, and incredibly useful when you’re dealing with uniqueness, lookups, or set-based operations. While a List might be your default go-to, it’s always worth asking: do I actually need ordering or duplicates? If not, HashSet could be a better choice.
That said, they won’t always be the right tool for every job. If you care about preserving item order, need indexed access, or are working with small datasets where performance isn’t a concern, a list might still be the better fit. But when speed and uniqueness matter and you want a super-scalable tool, give the trusty HashSet a go!

