On Thu, Apr 06, 2017 at 03:48:07PM -0700, Stefan Beller wrote:

> On Thu, Apr 6, 2017 at 1:37 PM, <git@xxxxxxxxxxxxxxxxx> wrote:
> > From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
> >
> > Teach traverse_trees_recursive() to not do redundant ODB
> > lookups when both directories refer to the same OID.
>
> And the reason for this is that omitting the second lookup
> saves time, i.e. a lookup in the ODB of a sufficiently large
> repo is slow.
>
> My kneejerk line of thinking:
> * yes, it sounds good to reduce the number of ODB accesses.
> * But if we consider ODB lookups to be slow and we perform
>   a structured access, how about a cache in front of the ODB?
> * We already have that! (sort of..) 9a414486d9 (lookup_object:
>   prioritize recently found objects, 2013-05-01)
> * Instead of improving the caching, maybe change the
>   size of the problem: We could keep the objects of different types
>   in different hash-tables.

I don't think this is using lookup_object() at all, though. It goes
straight to fill_tree_descriptor(), which will read the object contents
from disk. So our time is going to:

  1. Finding the object in the odb (ideally a binary search in a single
     pack index, but less optimal when there are many packs).

  2. Reconstructing the object. This means zlib-inflating from disk, but
     it may also mean delta reconstruction.

I think there _are_ some caches at play here, though, when you look up
the same tree back to back. The pack-mru means that we'll always look in
the correct pack first. And in theory the delta base cache means that
we'll already have the whole thing reconstructed in memory (though I
have often been confused by exactly when we put items into that cache,
so I might be wrong).

So in theory, this is not all that different from the "just allocate and
copy the bytes" optimization that's happening here (though I'm not
surprised that doing it at a higher level can produce some speedup).
I think the more interesting optimization is "just use the same buffer
without bothering to copy", which is hard for the low-level code to do
(since it doesn't know what lifetime the caller is expecting).

> object.c has its own hash table, I presume for historical and
> performance reasons, this would be split up to multiple hash
> tables.

So I don't think lookup_object() is really relevant here. But I'm also
not sure that multiple hash tables would really buy us much. In theory
hash tables are O(1), so multiple smaller tables doesn't help (and might
hurt, since now we have four O(1) lookups to do). Of course that's a
minor fiction because of collisions, but we keep the load factor on the
object.c table pretty low (and that's why things like quadratic probing
and cuckoo hashing never showed great improvement).

-Peff