Remove duplicates without using LINQ Distinct()?
Remove duplicates without using LINQ Distinct()
What interviewers test
- Hashing fundamentals
- Time vs memory trade-offs
- Whether you understand why Distinct() works
Real-world scenario
You are processing millions of user IDs coming from logs or Kafka messages.
You must remove duplicates fast and predictably.
Optimized approach (HashSet)
public static List<int> RemoveDuplicates(int[] input)
{
var seen = new HashSet<int>();
var result = new List<int>();
foreach (var item in input)
{
if (seen.Add(item)) // Add returns false if already exists
{
result.Add(item);
}
}
return result;
}
Complexity
Aspect Value
Time O(n)
Space O(n) (hash storage)
Why this is better than naive loops
- Nested loops = O(n²) ❌
- HashSet = constant-time lookup ✅
Interview Tip:
Say “I prefer HashSet because it gives O(1) average lookup and preserves intent clearly.”