On 6/2/21 12:56 AM, Tao Klerks wrote:
> Hi folks,
>
> I'm learning to use Partial Clone, and finding a behavior that I don't
> know how to interpret or investigate:
>
> Under some circumstances, doing a plain "git fetch <remote>" on a
> filtered repo results in a very long (6-30 min?) wait, during which I
> can see the following command being executed in the background:
>
> /usr/libexec/git-core/git rev-list --objects --stdin
> --exclude-promisor-objects --not --all --quiet --alternate-refs
>
> So far, I have noted this happening under two distinct circumstances:
> * Anytime I try to fetch on a filtered repo with a git 2.23 client -
> shorter pause
> * When I try to fetch with a recent (2.31) client in a repo where one
> large packfile has no *.promisor file (but the others do, and the
> remote I am fetching from has promisor=true) - looong pause

This makes me think that there was a bug fix for this situation but the
fix requires doing extra work. To help track this down, could you re-run
the scenario with GIT_TRACE2_PERF=1 which will give the full Git process
stack as we reach that rev-list call.

> Can anyone explain what this rev-list call intends, and/or any hints
> as to how I could see what the stdin content being fed to it from the
> parent process actually is?
>
> For background, I ended up in the "missing promisor file" situation by
> trying to be (too?) clever about the blobs present in my clone: I
> cloned unfiltered shallow to a certain depth with certain refspecs,
> then added the promisor and filter config, and finally fetched with
> "--unshallow". This produced exactly the blob-population state I
> intended, but meant the original first packfile had no ".promisor"
> file.

This is the critical point: you first cloned without a filter, and then
converted the remote to a promisor remote without marking the pack-files
you received from that remote as promisor pack-files.
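
If it helps to see which packs are in that state, here is a rough
sketch in Python. It is not something Git ships: the function name
packs_without_promisor is made up for illustration, and it only assumes
what you already described, namely that a promisor pack has a
".promisor" file sitting next to its ".pack" file. It reports packs and
does not modify anything.

```python
from pathlib import Path


def packs_without_promisor(pack_dir):
    """Return the names of *.pack files in pack_dir that have no
    sibling *.promisor marker file (hypothetical helper, report only)."""
    pack_dir = Path(pack_dir)
    missing = []
    for pack in sorted(pack_dir.glob("*.pack")):
        # pack-<hash>.pack -> pack-<hash>.promisor
        if not pack.with_suffix(".promisor").exists():
            missing.append(pack.name)
    return missing


# Example call, assuming a non-bare repository in the current directory:
# packs_without_promisor(".git/objects/pack")
```

Any pack it lists is one that Git cannot treat as having come from the
promisor remote.
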
That means that Git needs to do some work to discover which objects are
reachable from promisor packs and which are not, and that extra work is
slowing you down.

Partial clone is designed to work where every remote is a promisor
remote, and always has been. Any deviation from that norm is venturing
into uncharted territory and will have friction like this.

Another similar issue comes when you have multiple remotes and one of
them is a promisor remote and another is not. The general advice right
now is to use partial clone only if you will use it for all remotes
across the entire existence of the repo.

Part of the difficulty here is that once you download that first
pack-file from the remote, Git has no way of knowing whether the pack
came from that source or was created in another way. We have no way to
be sure that we can "upgrade" the remote in an automated process.

This does make me wonder what happens when Git repacks objects created
locally and then starts fetching from a promisor remote.

There are some challenges here, for sure. Most likely also some
potential gains, but it is unlikely to create a seamless experience for
what you are trying to do.

Thanks,
-Stolee