On 6/1/21 6:24 AM, Tao Klerks wrote:
> Hi folks,
>
> I'm trying to deepen my understanding of the Partial Clone
> functionality for a possible deployment at scale (with a large-ish
> 13GB project where we are using date-based shallow clones for the
> time being), and one thing that I can't get my head around yet is
> how you "unfilter" an existing filtered clone.
>
> The gitlab intro document
> (https://docs.gitlab.com/ee/topics/git/partial_clone.html#remove-partial-clone-filtering)
> suggests that you need to get the full list of missing blobs, and
> pass that into a fetch...:
>
> git fetch origin $(git rev-list --objects --all --missing=print | grep
> -oP '^\?\K\w+')

I think the short answer is to split your "git rev-list" call into
batches by limiting the count. Perhaps pipe that command to a file
and then split it into batches of "reasonable" size. Your definition
of "reasonable" may vary, so try a few numbers. (A rough sketch of
this batching approach is at the end of this message.)

> The official doc at https://git-scm.com/docs/partial-clone makes no
> mention of plans or goals (or non-goals) related to this
> "unfiltering" - is it something that we should expect a story to
> emerge around?

The design is not intended for this kind of "unfiltering". The
feature is built for repositories where doing so would be too
expensive (in both network time and disk space) to be valuable.

Also, asking for the objects one-by-one like this is very
inefficient on the server side. A fresh clone can make use of
existing delta compression in a way that this type of request
cannot (at least, not easily).

You _would_ be better off making a fresh clone and then adding its
pack-file to the .git/objects/pack directory of the repository you
want to unfilter. (The second sketch at the end of this message
shows the idea.)

Could you describe more about your scenario and why you want to get
all objects?

Thanks,
-Stolee
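
The batching sketch mentioned above -- a minimal version, assuming a
POSIX shell with xargs and a server that allows fetching arbitrary
object IDs (uploadpack.allowAnySHA1InWant); the batch size of 10000
is an arbitrary starting point, so tune it:

    # Collect the missing object IDs (the "?"-prefixed lines from
    # --missing=print), then fetch them 10000 at a time instead of
    # passing the entire list to a single fetch.
    git rev-list --objects --all --missing=print |
        sed -n 's/^?//p' |
        xargs -n 10000 git fetch origin

Each batch is still a pile of one-off object requests for the server
to satisfy, so treat this as a workaround, not a recommendation.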
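
And the fresh-clone sketch -- the URL and paths here are placeholders
for your setup:

    # Make a full (unfiltered) clone somewhere else, then copy its
    # pack-file(s) into the filtered repository's object store.
    git clone --bare https://example.com/repo.git /tmp/full-clone
    cp /tmp/full-clone/objects/pack/pack-*.pack \
       /tmp/full-clone/objects/pack/pack-*.idx \
       /path/to/filtered-repo/.git/objects/pack/

Once those packs are in place, "git rev-list --objects --all
--missing=print" in the filtered repository should report nothing
missing, and the full clone gets to reuse the server's existing
delta compression instead of asking it to serve objects one by one.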