On Thu, Apr 02, 2020 at 08:40:35PM +0200, René Scharfe wrote:

> > And I didn't even have to pre-size the table. This really makes me
> > wonder if there's some silly inefficiency in khash which we could be
> > addressing. Or maybe open-addressing really does lose to chaining
> > here, but I think we keep the load factor low enough that it should
> > be a win.
>
> Or we're just unlucky. I tried to find the difference between khash
> with and without presizing using callgrind, but came up empty. It did
> reveal that fast-import spends 70% of its cycles in a million memset()
> calls issued (indirectly) by git_deflate_init(), which in turn is
> called by store_object(), which is called from parse_and_store_blob(),
> though.

I think that 70% is outsized in this case because we're dumping
millions of 4-byte blobs. In a real repo you'd have larger blobs, as
well as actual trees and commits pulling them together.

> Why is the second gained when handling 1M objects not showing up in
> its output? I suspect it's because it uses its custom allocator to
> gather its data. So I ran the test with jemalloc2 preloaded:
>
>   nr_objects    master       khash        khash+preload
>   2^20          0m5.812s     0m5.600s     0m5.604s
>   2^21          0m12.913s    0m10.884s    0m10.357s
>   2^22          0m31.257s    0m21.461s    0m21.031s
>   2^23          1m20.904s    0m40.181s    0m42.607s
>   2^24          3m59.201s    1m21.104s    1m23.814s
>
> My measurements are noisy, but my point is simply that with a
> different allocator you'd not even have seen any slowdown when
> switching to khash.

Yeah, that makes sense. I still prefer the hashmap solution for its
lack of pointer hackery, given that it seems to perform as well or
better. I'll send a cleaned-up patch in a moment.

> >  struct object_entry {
> >  	struct pack_idx_entry idx;
> > -	struct object_entry *next;
> > +	struct hashmap_entry ent;
>
> That uses 16 bytes more memory per entry on x64 than khash would.
> That's 256MB for 2^24 objects -- not ideal, but bearable, I guess.

Isn't it 8? We're dropping the old pointer and replacing it with the
"next" pointer in hashmap_entry, plus our 4-byte hash code (which
likely gets padded to 8).

I think it's probably OK in practice.

> > +static int object_entry_hashcmp(const void *map_data,
> > +				const struct hashmap_entry *eptr,
> > +				const struct hashmap_entry *entry_or_key,
> > +				const void *keydata)
> > +{
> > +	const struct object_id *oid = keydata;
> > +	const struct object_entry *e1, *e2;
> > +
> > +	e1 = container_of(eptr, const struct object_entry, ent);
>
> That's nicer than the pointer alchemy in the khash conversion, for
> sure.
>
> But why const? Can const change the layout of a structure? Scary.

No, I don't think it can. I mostly copied the "const" from the other
container_of() hashmap sites.

I don't think it matters in practice, because we're assigning the
result to a const pointer anyway. But it seems a little cleaner not to
momentarily cast away the constness even inside the macro.

-Peff
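
To put a number on the 8-vs-16-byte question above, here is a small
stand-alone sketch. The struct names are made up for illustration and
only mimic the layout under discussion (hashmap_entry being a "next"
pointer plus an unsigned int hash); the pack_idx_entry fields are
elided, since only the delta matters:

	#include <stdio.h>

	struct hashmap_entry_sketch {
		struct hashmap_entry_sketch *next; /* 8 bytes on x64 */
		unsigned int hash;                 /* 4 bytes, padded to 8 */
	};

	struct old_entry_sketch {
		/* pack_idx_entry fields elided */
		struct old_entry_sketch *next;     /* old chaining pointer */
	};

	struct new_entry_sketch {
		/* pack_idx_entry fields elided */
		struct hashmap_entry_sketch ent;   /* new embedded entry */
	};

	int main(void)
	{
		printf("old: %zu bytes\n", sizeof(struct old_entry_sketch));
		printf("new: %zu bytes\n", sizeof(struct new_entry_sketch));
		return 0;
	}

On a typical LP64 box this prints 8 and 16, i.e. the embedded
hashmap_entry costs 8 bytes per entry more than the single chaining
pointer did, which works out to roughly 128MB for 2^24 objects.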
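
As for the container_of()/const question, a tiny stand-alone
illustration may help. It uses a generic Linux-kernel-style
container_of definition and invented struct names, so it is a sketch
of the idea rather than Git's exact macro; the point is that the const
in the type argument only const-qualifies the resulting pointer and
does not change the layout or the address that gets computed:

	#include <stddef.h>
	#include <stdio.h>

	/* generic container_of, for illustration only */
	#define container_of_sketch(ptr, type, member) \
		((type *)((char *)(ptr) - offsetof(type, member)))

	struct entry { struct entry *next; unsigned int hash; };
	struct item { int payload; struct entry ent; };

	int main(void)
	{
		struct item it = { 42, { NULL, 0 } };
		const struct entry *eptr = &it.ent;

		/* const type argument: result is a pointer to const ... */
		const struct item *a =
			container_of_sketch(eptr, const struct item, ent);

		/* ... but it points at the very same object */
		printf("%d %d\n", a->payload, a == &it);
		return 0;
	}

This prints "42 1". Dropping the const from the type argument would
only change the qualification of the macro's result, which the
assignment to a const pointer adds back anyway; the layout and the
computed address are the same either way.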