On 15/06/2022 05:00, Jeff King wrote:
> On Tue, Jun 14, 2022 at 08:35:16PM -0400, Taylor Blau wrote:
>
>> On Tue, Jun 14, 2022 at 01:27:18PM -0400, Derrick Stolee wrote:
>>>> Did you have any other sort of performance test in mind? The remotes we
>>>> typically deal with are geographically far away and deal with a high volume
>>>> of traffic so we're keen to move behaviour to the client where it makes sense
>>>> to do so.
>>>
>>> I guess I wonder how large your promisor pack-files are in this test,
>>> since your implementation depends on for_each_packed_object(), which
>>> should be really inefficient if you're actually dealing with a large
>>> partial clone.
>>
>> I had the same thought. Storing data available in the promisor packs
>> into an oid_map is going to be expensive if there are many such objects.
>>
>> Is there a reason that we can't introduce a variant of
>> find_kept_pack_entry() that deals only with .promisor packs and look
>> these things up as-needed?
>
> It's much worse than that. The promisor mechanism is fundamentally very
> inefficient in runtime, optimizing instead for size. Imagine I have a
> partial clone and I retrieve tree X, which points to a blob Y that I
> don't get. I have X in a promisor pack, and asking about it is
> efficient. But if I want to know about Y, I have no data structure
> mentioning Y except the tree X itself. So to enumerate all of the
> promisor edges, I have to walk all of the trees in the promisor pack.
>
> So it is not just lookup, but actual tree walking that is expensive. The
> flip side is that you don't have to store a complete separate list of
> the promised objects. Whether that's a win depends on how many local
> objects you have, versus how many are promised.
>
> But it would be possible to cache the promisor list to make the tradeoff
> separately. E.g., do the walk over the promisor trees once (perhaps at
> pack creation time), and store a sorted list of fixed-length (oid, type)
> records that could be binary searched. You could even put it in the
> .promisor file. :)
>
> -Peff

I like the idea of caching the promisor list at pack creation time; I'll
start work on a patch set that implements this.

Meanwhile, is it worth considering a '--promised-as-missing' option (or
a config option) for invocations such as 'mktree --missing' that
prevents promised objects being faulted-in? Currently, the only reliable
way that I've found to prevent 'mktree --missing' faulting-in promised
objects is to remove the remote. Such an option could either set the
global variable 'fetch_if_missing' to '0' or could ensure
'OBJECT_INFO_SKIP_FETCH_OBJECT' is passed appropriately.

Cheers,
Richard
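
P.S. For concreteness, here is a rough standalone sketch of the kind of
sorted, fixed-length (oid, type) list Peff describes, looked up by binary
search. It is not git code: the struct, the function names, the 20-byte
(SHA-1) oid length and the numeric type values are all assumptions made
for illustration, and it says nothing about how the records would be laid
out in a .promisor file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: SHA-1 sized object ids; a real patch would use the repo's hash. */
#define PROMISED_OID_LEN 20

/* Hypothetical fixed-length record: raw object id followed by its type. */
struct promised_record {
	unsigned char oid[PROMISED_OID_LEN];
	uint8_t type; /* illustrative values, e.g. 2 = tree, 3 = blob */
};

/* Order records by raw oid bytes so the list can be binary searched. */
static int promised_record_cmp(const void *a, const void *b)
{
	const struct promised_record *ra = a;
	const struct promised_record *rb = b;

	return memcmp(ra->oid, rb->oid, PROMISED_OID_LEN);
}

/* Sort once, e.g. after walking the promisor trees at pack creation time. */
static void promised_list_sort(struct promised_record *list, size_t nr)
{
	if (nr)
		qsort(list, nr, sizeof(*list), promised_record_cmp);
}

/* Return the record for 'oid', or NULL if the object is not promised. */
static const struct promised_record *promised_list_lookup(
	const struct promised_record *list, size_t nr,
	const unsigned char *oid)
{
	struct promised_record key;

	assert(list || !nr);
	if (!nr)
		return NULL;

	memcpy(key.oid, oid, PROMISED_OID_LEN);
	key.type = 0; /* ignored by the comparison */
	return bsearch(&key, list, nr, sizeof(*list), promised_record_cmp);
}
```

With something like this, checking whether an object is promised costs
O(log n) comparisons against the cached list instead of a walk over every
tree in the promisor pack, at the price of storing one record per
promised object.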