On 6/16/2022 2:07 AM, Jeff King wrote:
> On Wed, Jun 15, 2022 at 02:17:58PM -0400, Derrick Stolee wrote:
>
>> On 6/15/2022 1:40 PM, Richard Oliver wrote:
>>> On 15/06/2022 05:00, Jeff King wrote:
>>
>>>> So it is not just lookup, but actual tree walking that is expensive. The
>>>> flip side is that you don't have to store a complete separate list of
>>>> the promised objects. Whether that's a win depends on how many local
>>>> objects you have, versus how many are promised.
>>
>> This is also why blobless (or blob-size filters) are the recommended way
>> to use partial clone. It's just too expensive to have tree misses.
>
> I agree that tree misses are awful, but I'm actually talking about
> something different: traversing the local trees we _do_ have in order to
> find the set of promised objects. Which is worse for blob:none, because
> it means you have more trees locally. :)

Ah, I misread your email. I agree that walking trees is far too
expensive to do just to find an object type.

> Try this with a big repo like linux.git:
>
>   git clone --no-local --filter=blob:none linux.git repo
>   cd repo
>
>   # this is fast; we mark the promisor trees as UNINTERESTING, so we do
>   # not look at them as part of the traversal, and never call
>   # is_promisor_object().
>   time git rev-list --count --objects --all --exclude-promisor-objects
>
>   # but imagine we had a fixed mktree[1] that did not fault in the blobs
>   # unnecessarily, and we made a new tree that references a promised
>   # blob.
>   tree=$(git ls-tree HEAD~1000 | grep Makefile | git mktree --missing)
>   commit=$(echo foo | git commit-tree -p HEAD $tree)
>   git update-ref refs/heads/foo $commit
>
>   # this is now slow; even though we only call is_promisor_object()
>   # once, we have to open every single tree in the pack to find it!
>   time git rev-list --count --objects --all --exclude-promisor-objects
>
> Those rev-lists run in 1.7s and 224s respectively. Ouch!

This is exactly the reason I thought just asking for the objects
directly is faster than scanning all the packs. Thanks for giving
concrete numbers that support that assumption.

Thanks,
-Stolee
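
To illustrate why a single is_promisor_object() call can be so costly, here is a toy Python model. It is a sketch under assumptions, not Git's actual implementation: the class ToyPromisorIndex, its fields, and the sample object names are all hypothetical. It assumes only what the thread describes, namely that the set of promised objects is discovered by opening every tree in the local promisor packs, and that this happens lazily on the first lookup.

```python
from typing import Dict, List, Optional, Set


class ToyPromisorIndex:
    """Toy model of a lazily built 'promised objects' set.

    promisor_trees maps a (hypothetical) tree id to the ids that tree
    references. Real Git reads this information from packfiles.
    """

    def __init__(self, promisor_trees: Dict[str, List[str]]) -> None:
        self.promisor_trees = promisor_trees
        self.promised: Optional[Set[str]] = None  # built on first lookup
        self.trees_opened = 0                     # cost counter

    def _build(self) -> None:
        # Open every local promisor tree and record everything it
        # references. The cost grows with the number of local trees,
        # regardless of how many lookups follow.
        promised: Set[str] = set()
        for tree_id, entries in self.promisor_trees.items():
            self.trees_opened += 1
            promised.add(tree_id)
            promised.update(entries)
        self.promised = promised

    def is_promisor_object(self, oid: str) -> bool:
        if self.promised is None:
            self._build()
        assert self.promised is not None
        return oid in self.promised


if __name__ == "__main__":
    # A blob:none-style clone has many local trees and few local blobs.
    trees = {f"tree{i}": [f"blob{i}"] for i in range(1000)}
    index = ToyPromisorIndex(trees)
    print(index.trees_opened)                  # no lookups yet
    print(index.is_promisor_object("blob42"))  # one lookup...
    print(index.trees_opened)                  # ...opened every tree
```

In this model the first lookup pays for the full scan and later lookups are cheap set checks, which matches the shape of the 1.7s versus 224s measurement quoted above: the traversal that never asks the question never pays the cost.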