Tyler Earman wrote:
> Does anyone know of a semi-efficient way of detecting duplicates in a
> given set of data (specifically duplicate integers) other than just
> sorting the data into an array and then stepping through each element in
> the array?
>
> Preferably a solution wouldn't involve something implementation specific
> (like loading the array into a different data type and measuring the
> differences between the two, saw someone doing that with Java ArrayLists
> and HashSets) but rather some sort of mathematically correct method of
> doing it.

I don't believe there's any way of doing this that is more efficient than
a hash table. An informal argument: any correct method has to examine
every element at least once, so you can't beat O(N). With a hash table
made large enough that collisions are improbable, inserting an element
and checking whether it was already present takes expected O(1) time, so
the whole scan over a bag of N elements runs in expected O(N), matching
that lower bound. Sorting first, by comparison, costs O(N log N) before
you even start stepping through the array.

Andrew.
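P.S. Here's a minimal sketch of the hash-table approach, in Java since
HashSet was the structure mentioned above (the class and method names
are just illustrative; any hash table will do):

import java.util.HashSet;
import java.util.Set;

public class DuplicateCheck {

    // Returns true if 'data' contains at least one repeated value.
    // Each element is inserted into a hash set; add() returns false
    // when the value is already present. With the set sized to keep
    // the load factor low, each insert-and-check is expected O(1),
    // so the whole scan is expected O(N).
    static boolean hasDuplicate(int[] data) {
        Set<Integer> seen = new HashSet<>(data.length * 2);
        for (int value : data) {
            if (!seen.add(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicate(new int[] {3, 1, 4, 1, 5})); // true
        System.out.println(hasDuplicate(new int[] {2, 7, 1, 8}));    // false
    }
}

If the integers are known to fall in a small range, a plain boolean array
indexed by value gives you the same O(N) bound with no hashing at all.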