On Tue, Mar 30, 2021 at 2:19 PM Elijah Newren <newren@xxxxxxxxx> wrote:
>
> On Tue, Mar 30, 2021 at 11:58 AM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> >
> > Jeff King <peff@xxxxxxxx> writes:
> >
> > > ... though if we go that route, I suspect we ought to be adding both the
> > > original _and_ the replacement.
> >
> > So "branch --contains X" would ask "which of these branches reach X
> > or its replacement?" and "branch --no-contains X" would ask "which
> > of these do not reach X nor its replacement?" --- I guess the result
> > is still internally consistent (meaning: any and all branches fall
> > into either "--contains X" or "--no-contains X" camp).
>
> I'm not so sure about this interpretation. Based on the documentation
> in git-replace(1):
>
>     Replacement references will be used by default by all Git commands
>     except those doing reachability traversal (prune, pack transfer and
>     fsck).
>
> I would have thought that
>
>   * "branch --contains X" would ask "which of these branches reach X's
>     replacement?"
>   * "git --no-replace-objects branch --contains X" would ask "which of
>     these branches reach X?"
>
> and if folks really wanted to check whether either X or its
> replacement were reachable then they'd need to run both commands.
>
> The only place outside of reachability traversal where I think it
> makes sense for a command to distinguish between X being a replace ref
> for Y and Y itself is in `git log` where it can show the "replaced"
> moniker.
>
> > > I'm not entirely sure this is a good direction, though.
> > >
> > >> and possibly worse, if I create a new branch based on it and use it:
> > >>
> > >>     $ git branch foobar deadbeefdeadbeefdeadbeefdeadbeefdeadbeef
> > >>     $ git checkout foobar
> > >>     $ echo stuff >empty
> > >>     $ git add empty
> > >>     $ git commit -m more
> > >>
> > >> then it's clear that branch created foobar pointing to the replaced
> > >> object rather than the replacement object -- despite the fact that the
> > >> replaced object doesn't even exist within this repo:
> > >>
> > >>     $ git cat-file -p HEAD
> > >>     tree 18108bae26dc91af2055bc66cc9fea278012dbd3
> > >>     parent deadbeefdeadbeefdeadbeefdeadbeefdeadbeef
> > >>     author Elijah Newren <newren@xxxxxxxxx> 1617083739 -0700
> > >>     committer Elijah Newren <newren@xxxxxxxxx> 1617083739 -0700
> > >>
> > >>     more
> > >
> > > Yeah, that's pretty horrible.
> >
> > I am not sure. As you analize below, the replace mechanism is about
> > telling Git: when anybody refers to deadbeef, use its replacement if
> > defined instead.
> >
> > And one of the points in the mechanism is to allow to do so even
> > retroactively, so the HEAD object there may be referring to deadbeef
> > that may not exist does not matter, as long as the object that is to
> > replace deadbeef is available. If not, that is a repository
> > corruption. After all, the commit object you cat-file'ed may have
> > been created by somebody else in a separate repository that had
> > deadbeef before they were told by Elijah that the object is obsolete
> > and to be replaced by something else (Git supports distributed
> > development) and then pulled into Elijah's repository, and we should
> > be prepared to seeing "parent deadbeef" in such a commit. As long as
> > replacement happens when accessing the contents, that would be OK.
> >
> > So, I do not see it as "pretty horrible", but I may be missing
> > something.
>
> I think you're focusing on git commit, or perhaps on git checkout.
> I'm focusing on git branch; what it did does not seem fine to me.
> Using your own words:
>
> "the replace mechanism is about telling Git: when anybody refers to
> deadbeef, use its replacement if defined instead."
>
> git branch didn't do that; it put deadbeef into refs/heads/foobar.

Perhaps I should also explain why this not only breaks expectations,
but also why the broken expectation causes problems:

  * People tend to have commit hashes stored in lots of weird places --
    bug trackers, reports, emails, etc. These tend to be important for
    only a short time period, but the sheer number of these references
    makes life harder for folks who want to rewrite history to fix
    various past issues (very large binary blobs and other misdeeds).
  * filter-repo uses replace refs to provide users with a way to access
    new commits using old commit hashes, to help them through this
    transition period.
  * Additional refs (especially one for every commit) will cause some
    slowness. So it's nice to be able to provide these replace refs for
    a short-term transition, but tell users they can simply delete the
    replace refs, without consequence, when they no longer need them.

The fact that git branch puts deadbeef into refs/heads/foobar leads to
a chain where new commits now rely on replace refs. In the best
case, others will not be able to pull from this user and the user will
not be able to push the new commits anywhere -- and that user will
have some work to do to rewrite (rebase?) the commits appropriately.
In the worst case, the user does succeed in distributing this new
history, and now all users everywhere will be required to keep all
replace refs for all time (or at least until the next major repository
rewrite)...
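
To make the failure mode above concrete, here is a small toy model in
Python. It is only a sketch and not how git is implemented: the Repo
class, its methods, the peel_replacement switch, and the short object
names ("new1", "more") are all made up for illustration. It assumes,
as in the example quoted above, that the replaced object (deadbeef)
does not exist locally and is reachable only through its replace ref.

```python
# Toy model of the scenario discussed above; NOT git's implementation.
# Every name here (Repo, create_branch, peel_replacement, ...) is hypothetical.

class Repo:
    def __init__(self):
        self.objects = {}   # oid -> parent oid (None for a root commit)
        self.replace = {}   # replaced oid -> replacement oid (refs/replace/*)
        self.heads = {}     # branch name -> oid (refs/heads/*)

    def lookup(self, oid):
        """Return the oid that is actually read, honoring replace refs."""
        oid = self.replace.get(oid, oid)
        if oid not in self.objects:
            raise KeyError(f"missing object {oid}")
        return oid

    def create_branch(self, name, oid, peel_replacement):
        self.lookup(oid)  # the start point must be readable
        # peel_replacement=False models the behavior reported in this thread:
        # the ref stores the name the user typed, not the replacement.
        self.heads[name] = self.replace.get(oid, oid) if peel_replacement else oid

    def commit(self, branch, new_oid):
        # A new commit records the branch's current value as its parent.
        self.objects[new_oid] = self.heads[branch]
        self.heads[branch] = new_oid

    def is_connected(self, branch):
        """Walk parents from the branch tip; False if any object is missing."""
        oid = self.heads[branch]
        try:
            while oid is not None:
                oid = self.objects[self.lookup(oid)]
        except KeyError:
            return False
        return True


def scenario(peel_replacement):
    repo = Repo()
    repo.objects["new1"] = None          # rewritten commit, exists locally
    repo.replace["deadbeef"] = "new1"    # old hash survives only as a replace ref
    repo.create_branch("foobar", "deadbeef", peel_replacement)
    repo.commit("foobar", "more")
    before = repo.is_connected("foobar")
    repo.replace.clear()                 # user deletes the replace refs
    after = repo.is_connected("foobar")
    # (parent recorded in the new commit, connected before, connected after)
    return repo.objects["more"], before, after


if __name__ == "__main__":
    print(scenario(peel_replacement=False))
    print(scenario(peel_replacement=True))
```

In this model, storing the name the user typed leaves the new commit
with "deadbeef" as its parent, so the history is walkable only while
the replace ref exists. Storing the replacement instead keeps the
history intact after the replace refs are deleted, which is the
property the short-term transition use case depends on.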