Re: [PATCH] hashmap: add API to disable item counting when threaded

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff King <peff@xxxxxxxx> writes:

> On Sat, Sep 02, 2017 at 01:31:19AM +0200, Johannes Schindelin wrote:
>
>> BTW this made me think that we may have a problem in our code since
>> switching from my original hashmap implementation to the bucket one added
>> in 6a364ced497 (add a hashtable implementation that supports O(1) removal,
>> 2013-11-14): while it is not expected that there are many collisions, the
>> "grow_at" logic still essentially assumes the number of buckets to be
>> equal to the number of hashmap entries.
>
> I'm confused about what the problem is. If I am reading the code
> correctly, "size" is always the number of elements and "grow_at" is the
> table size times a load factor. Those are the same numbers you'd use to
> decide to grow in an open-address table.
>
> It's true that this does not take into account the actual number of
> collisions we see (or the average per bucket, or however you want to
> count it). But generally nor do open-address schemes (and certainly our
> other hash tables just use load factor to decide when to grow).

Are we comparing the hashmap.[ch] with the hash.[ch] added in
9027f53c ("Do linear-time/space rename logic for exact renames",
2007-10-25)?  I am a bit confused because Johannes calls it "my"
original.

Unless the real person in this discussion thread sending the
messages under Johannes's name is Linus, that is ;-).  Or maybe the
"original" being compared is something other than the series with
6a364ced497 replaced with its hashmap.[ch]?

In any case, I do think your reading of the code is correct in that
the comparison between size and grow-at/shrink-at is done correctly
with the true load factor of the table, not how many buckets out of
the possible buckets are filled.  

Old one used to grow at 50% full and never shrunk it, but the
current one grows at 80% and shrinks at a bit below 40%; I agree
with Dscho's feeling (in part not quoted above) that 50% vs 80%
doesn't seem to have been backed by any numbers, but optimizing the
load factor is outside the scope of this series, I would think.

6a364ced ("add a hashtable implementation that supports O(1)
removal", 2013-11-14) credits less frequent resizing for gain of
insert performance, but my hunch is that the need for frequent
resizing in the version before it primarily comes from the fact that
the table started empty (as opposed to having an initial size of 64,
which is what the current implementation uses).



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux