On Thu, Nov 28, 2019 at 12:42:02AM +0000, Eric Wong wrote: > > Add a hashmap containing the packfile names as we load them so that > > the average runtime cost of checking for already-loaded packs becomes > > constant. > > Btw, would you have time to do a comparison against khash? > > AFAIK hashmap predates khash in git; and hashmap was optimized > for removal. Removals don't seem to be a problem for pack > loading. Actually, they came around simultaneously. I think hashmap.[ch] was mostly a response to our open-coded hashes, like the one in object.c (which still uses neither of the reusable forms!). Those didn't handle removal at all. khash does handle removal, though you pay a price in tombstone entries until the next resize operation. > I'm interested in exploring the removing of hashmap entirely in > favor of khash to keep our codebase smaller and easier-to-learn. > khash shows up more in other projects, and ought to have better > cache-locality. I have been tempted to push for that, too. Every timing I have ever done shows khash as faster (though for a trivial use like this one, I would be quite surprised if it mattered either way). My hesitation is that khash can be harder to debug because of the macro implementation. But I have rarely needed to look beneath its API. -Peff