Hi Christian,

I think my objections may be based on a misunderstanding of John's and
your original proposal. From reading [1], it seemed to me like a
required step of this proposal was to upload the objects you want to
filter out ahead of time, and then run `git repack -ad --filter=...`.

So my concerns thus far have been around the lack of cohesion between
(1) the filter which describes the set of objects uploaded to the HTTP
server, and (2) the filter used when re-filtering the repository. If
(1) and (2) aren't inverses of each other, then in the case where (2)
leaves behind an object which wasn't caught by (1), we have lost that
object.

If instead the server used in your script at [1] is a stand-in for an
ordinary Git remote, that changes my thinking significantly. See below
for more details:

On Tue, Feb 22, 2022 at 06:11:11PM +0100, Christian Couder wrote:
> > > > I share the same concern as Robert and Stolee do. But I think this issue
> > > > goes deeper than just naming.
> > > >
> > > > Even if we called this `git repack --delete-filter` and only ran it with
> > > > `--i-know-what-im-doing` flag, we would still be leaving repository
> > > > corruption on the table, just making it marginally more difficult to
> > > > achieve.
> > >
> > > My opinion on this is that the promisor object mechanism assumes by
> > > design that some objects are outside a repo, and that this repo
> > > shouldn't care much about these objects possibly being corrupted.
> >
> > For what it's worth, I am fine having a mode of repack which allows us
> > to remove objects that we know are stored by a promisor remote. But this
> > series doesn't do that, so users could easily run `git repack -d
> > --filter=...` and find that they have irrecoverably corrupted their
> > repository.
>
> In some cases we just know the objects we are removing are stored by a
> promisor remote or are replicated on different physical machines or
> both, so you should be fine with this.

I am definitely OK with a convenient way to re-filter your repository
locally, so long as you know that the objects you are filtering out are
available via some promisor remote. But perhaps I have misunderstood
what this proposal is for.

Reading through John's original cover letter and the link to your demo
script, I understood that a key part of this was being able to upload
the pack of objects you were about to filter out of your local copy to
some server (not necessarily Git) over HTTP. My hesitation so far has
been based on that understanding.

Reading these patches, I don't see a mechanism to upload objects we're
about to expunge to a promisor remote. But perhaps I'm misunderstanding:
if you are instead assuming that the existing set of remotes can serve
any objects that we delete, and that this is the way to delete them,
then I am OK with that approach.

But either way, I think I am missing some details in the original
proposal that would perhaps have made it easier for me to understand
what your goals are.

In any case, this patch series doesn't seem to correctly set up a
promisor remote for me, since doing the following on a fresh clone of
git.git (after running "make"):

    $ bin-wrappers/git repack -ad --filter=blob:none
    $ bin-wrappers/git fsck

results in many "missing blob" and "missing link" lines of output.

(FWIW, I think what's missing here is correctly setting up the affected
remote(s) as promisors and indicating that we're now in a filtered
setting when going from a full clone down to a partial one.)
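(As a rough sketch only, and assuming the remote we expect to lazily
fetch from is "origin": the kind of setup I'd expect when going from a
full clone down to a partial one is something like

    $ git config remote.origin.promisor true
    $ git config remote.origin.partialCloneFilter blob:none
    $ git repack -ad --filter=blob:none

along with marking the surviving pack(s) as promisor packs (i.e.,
writing a ".promisor" file alongside each ".pack"), so that later
object lookups know the filtered-out blobs can be fetched from "origin"
on demand. I haven't verified that this is exactly the set of steps
this series would need, but fsck staying quiet afterwards would be a
good sign that the conversion is complete.)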
> If you are not fine with this because sometimes a user might use it
> without knowing, then why are you ok with commands deleting refs not
> checking that there isn't a regular repack removing dangling objects?

I'm not sure I totally understand your question, but my general sense
has been "because we typically make it difficult / impossible to remove
reachable objects".

Thanks,
Taylor

[1]: https://gitlab.com/chriscool/partial-clone-demo/-/blob/master/http-promisor/server_demo.txt#L47-52