Hi Taylor,

On 26 Feb 2022, at 15:30, Taylor Blau wrote:

> On Sat, Feb 26, 2022 at 03:19:11PM -0500, John Cai wrote:
>> Thanks for bringing this up again. I meant to write back regarding what you
>> raised in the other part of this thread. I think this is a valid concern. To
>> attain the goal of offloading certain blobs onto another server (B) and
>> saving space on a git server (A), there will essentially be two steps: one to
>> upload objects to (B), and one to remove objects from (A). As you said, these
>> two need to be the inverse of each other or else you might end up with
>> missing objects.
>
> Do you mean that you want to offload objects both from a local clone of
> some repository, _and_ the original remote it was cloned from?

Yes, exactly. The "another server" would be something like an HTTP server, or
another remote which hosts a subset of the objects (say, the large blobs).

> I don't understand what the role of "another server" is here. If this
> proposal was about making it easy to remove objects from a local copy of
> a repository based on a filter provided that there was a Git server
> elsewhere that could act as a promisor remote, then that makes sense to
> me.
>
> But I think I'm not quite understanding the rest of what you're
> suggesting.

Sorry for the lack of clarity here. The goal is to make it easy for a remote
to offload a subset of its objects to __another__ remote (either a Git server
or an HTTP server through a remote helper).

>>> My other concern was around what guarantees we currently provide for a
>>> promisor remote. My understanding is that we expect an object which was
>>> received from the promisor remote to always be fetch-able later on. If
>>> that's the case, then I don't mind the idea of refiltering a repository,
>>> provided that you only need to specify a filter once.
>>
>> Could you clarify what you mean by re-filtering a repository? By that I
>> assumed it meant specifying a filter, e.g. 100mb, and then narrowing it by
>> specifying a 50mb filter.
>
> I meant: applying a filter to a local clone (either where there wasn't a
> filter before, or a filter which matched more objects) and then removing
> objects that don't match the filter.
>
> But your response makes me think of another potential issue. What
> happens if I do the following:
>
>     $ git repack -ad --filter=blob:limit=100k
>     $ git repack -ad --filter=blob:limit=200k
>
> What should the second invocation do? I would expect that it needs to do
> a fetch from the promisor remote to recover any blobs between (100, 200]
> KB in size, since they would be gone after the first repack.
>
> This is a problem not just with two consecutive `git repack --filter`s,
> I think, since you could cook up the same situation with:
>
>     $ git clone --filter=blob:limit=100k git@xxxxxxxxxx:git
>     $ git -C git repack -ad --filter=blob:limit=200k
>
> I don't think the existing patches handle this situation, so I'm curious
> whether it's something you have considered or not before.

I have not; I will have to think through this case, but it sounds similar to
what [1] is about.

> (Unrelated to the above, but please feel free to trim any quoted parts
> of emails when responding if they get overly long.)
>
> Thanks,
> Taylor

Thanks
John

1. https://lore.kernel.org/git/pull.1138.v2.git.1645719218.gitgitgadget@xxxxxxxxx/
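
P.S. To make the two-step idea above a bit more concrete, here is a rough
sketch of what the state on server (A) could look like once the large blobs
already live on (B). The remote name "B", the URL, and the 100k cutoff are
only illustrative, and step one (actually getting the objects onto (B)) is the
part that still needs design; the config keys are the existing promisor-remote
ones, and the repack uses the --filter option under discussion:

    # On server (A): declare (B) as a promisor remote that can serve the
    # filtered-out blobs on demand.
    git remote add B https://example.com/offload/repo.git
    git config remote.B.promisor true
    git config remote.B.partialCloneFilter blob:limit=100k

    # Step two: repack (A) with the same filter so the offloaded blobs are
    # no longer stored locally.
    git repack -ad --filter=blob:limit=100k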