On Sun, Feb 25, 2024 at 08:50:36AM +0800, Herbert Xu wrote: > On Sat, Feb 24, 2024 at 10:10:27PM +0000, David Laight wrote: > > > > > Normally an rhashtable gets resized when it reaches 75% capacity > > > so the average chain length should always be one. > > > > The average length of non-empty hash chains is more interesting. > > You don't usually search for items in empty chains. > > The only way you'll get all the chains of length one is if you've > > carefully picked the data so that it hashed that way. > > Sure. But given the 75% capacity, you'd need a really bad hash > function to get an *average* (not worst-case) chain length of > 10. > > > I remember playing around with the elf symbol table for a browser > > and all its shared libraries. > > While the hash function is pretty trivial, it really didn't matter > > whether you divided 2^n, 2^n-1 or 'the prime below 2^n' some hash > > chains were always long. > > Even in the unlikely event of bad luck and everything bunches up > together, we change theh hash function (through hash_rnd) every > time we resize so you would expect things to even out after the > resize event. > > A rehash is also automatically triggered if the worst-case chain > length exceeds 16. 16!? that's crap, use a decent hash function and 3-5 should be your worst upper bound.