Re: number of calls to hash function in unordered_set

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 11 Jun 2018, Frank Tetzel wrote:

Hi,

can anybody give me a reasonable explanation why the hash function for
a simple unordered_set is called twice as much as I would expect.

Example:


#include <iostream>
#include <unordered_set>

struct MyHash{
	static int numberOfCalls;

	size_t operator()(int i) const noexcept{
		++numberOfCalls;
		//return std::hash<int>{}(i);
		return i;
	}
};
int MyHash::numberOfCalls = 0;

int main(){
	std::unordered_set<int,MyHash> set;

	std::cout << MyHash::numberOfCalls << '\n';
	set.reserve(1024);
	std::cout << MyHash::numberOfCalls << '\n';
	for(int i=0; i<1024; ++i){
		set.emplace(i);
	}
	std::cout << MyHash::numberOfCalls << '\n';

	return 0;
}




output:
0
0
2047

It seems to always be (inserted_elements * 2) - 1.
When adding 2 elements, it hashes 3 times. When adding 1 element, it
only hashes once. What is happening here? Why is the hash calculated
twice? What is so special about the last or first element?

https://gcc.gnu.org/onlinedocs/libstdc++/manual/unordered_associative.html#containers.unordered.cache answers part of the question (libc++ on the other hand seems to have chosen always to cache the hash).

The other part is related to how the elements are linked in the hash table, note that the hashes computed are 0 1 0 2 1 3 2 4 3 5 4 6 5 7 6... Reading the code seems best to understand what is going on.

--
Marc Glisse



[Index of Archives]     [Linux C Programming]     [Linux Kernel]     [eCos]     [Fedora Development]     [Fedora Announce]     [Autoconf]     [The DWARVES Debugging Tools]     [Yosemite Campsites]     [Yosemite News]     [Linux GCC]

  Powered by Linux