Removing Partial Clone / Filtered Clone on a repo

Tao Klerks <tao@xxxxxxxxxx> · Tue, 1 Jun 2021 12:24:23 +0200

Hi folks,

I'm trying to deepen my understanding of the Partial Clone
functionality for a possible deployment at scale (with a large-ish
13GB project where we are using date-based shallow clones for the time
being), and one thing that I can't get my head around yet is how you
"unfilter" an existing filtered clone.

The GitLab intro document
(https://docs.gitlab.com/ee/topics/git/partial_clone.html#remove-partial-clone-filtering)
suggests that you need to get the full list of missing blobs, and pass
that into a fetch...:

git fetch origin $(git rev-list --objects --all --missing=print | grep -oP '^\?\K\w+')

In my project's case, that would be millions of blob IDs! I tested
this with a path-based filter to rev-list, to see what getting 30,000
blobs might look like, and it took a looong while... I don't
understand much about the negotiation process, but I have to assume
there is a fixed per-blob cost in this scenario which is *much* higher
than in a "regular" fetch or clone.
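For reference, here is a rough sketch of how that same fetch could be fed in batches rather than as one giant argument list, since millions of IDs would likely exceed the OS limit on command-line length. The function names (extract_missing, fetch_in_batches) and the batch size are made up for illustration, and it assumes the server accepts fetches by object ID. It does not address the per-blob cost, which is the part I don't understand.

```shell
#!/bin/sh
# Sketch only. Both function names are hypothetical helpers, not git commands.

# Keep only the IDs that "rev-list --missing=print" marked as missing.
# Those lines start with "?"; uses portable sed instead of GNU-only grep -P.
extract_missing() {
    sed -n 's/^?\([0-9a-f][0-9a-f]*\).*$/\1/p'
}

# Read object IDs on stdin and fetch them from a remote,
# passing at most batch_size IDs to each "git fetch" invocation.
#   $1 = remote name, $2 = maximum IDs per fetch
fetch_in_batches() {
    remote=$1
    batch_size=$2
    xargs -n "$batch_size" git fetch "$remote"
}
```

Inside the filtered clone this would be run as:
git rev-list --objects --all --missing=print | extract_missing | fetch_in_batches origin 1000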

Obviously one answer is to throw away the repo and start again with a
clean unfiltered clone... But between repo-local config, project
settings in IDEs / external tools, and unpushed local branches, this
is an awkward thing to ask people to do.

I initially thought it might be possible to add an extra remote
(without filter / promisor settings), mess with the negotiation
settings to make the new remote not know anything about what's local,
and then get a full set of refs and their blobs from that remote...
but I must have misunderstood how the negotiation-tip stuff works
because I can't get that to do anything (it always "sees" my existing
refs and I just get the new remote's refs "for free" without object
transfer).
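By "filter / promisor settings" I mean the per-remote config keys that git-config documents as remote.<name>.promisor and remote.<name>.partialclonefilter. A minimal read-only sketch for inspecting them follows; the function name is hypothetical and "origin" is only a default. Whether changing these values is enough to get a complete repository is exactly what I can't work out, so the sketch only reads them.

```shell
#!/bin/sh
# Sketch only: print the partial-clone settings recorded for one remote.
# show_partial_clone_settings is a hypothetical helper, not a git command.
#   $1 = remote name (defaults to "origin" for illustration)
show_partial_clone_settings() {
    remote=${1:-origin}
    # "true" when Git treats this remote as a promisor for missing objects.
    promisor=$(git config --get "remote.$remote.promisor" || echo unset)
    # The filter spec recorded for this remote, e.g. blob:none.
    filter=$(git config --get "remote.$remote.partialclonefilter" || echo unset)
    printf 'promisor=%s filter=%s\n' "$promisor" "$filter"
}
```

Calling show_partial_clone_settings for the original remote and for the extra one shows at a glance which of them carries a filter.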

The official doc at https://git-scm.com/docs/partial-clone makes no
mention of plans or goals (or non-goals) related to this "unfiltering"
- is it something that we should expect a story to emerge around?

Thanks,
Tao Klerks