Re: Bug report: git branch behaves as if --no-replace-objects is passed


 



On Tue, Mar 30, 2021 at 2:19 PM Elijah Newren <newren@xxxxxxxxx> wrote:
>
> On Tue, Mar 30, 2021 at 11:58 AM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> >
> > Jeff King <peff@xxxxxxxx> writes:
> >
> > > ... though if we go that route, I suspect we ought to be adding both the
> > > original _and_ the replacement.
> >
> > So "branch --contains X" would ask "which of these branches reach X
> > or its replacement?" and "branch --no-contains X" would ask "which
> > of these do not reach X nor its replacement?" --- I guess the result
> > is still internally consistent (meaning: any and all branches fall
> > into either "--contains X" or "--no-contains X" camp).
>
> I'm not so sure about this interpretation.  Based on the documentation
> in git-replace(1):
>
>        Replacement references will be used by default by all Git commands
>        except those doing reachability traversal (prune, pack transfer and
>        fsck).
>
> I would have thought that
>
> * "branch --contains X" would ask "which of these branches reach X's
> replacement?"
> * "git --no-replace-objects branch --contains X" would ask "which of
> these branches reach X?"
>
> and if folks really wanted to check whether either X or its
> replacement were reachable then they'd need to run both commands.
>
> The only place outside of reachability traversal where I think it
> makes sense for a command to distinguish between X being a replace ref
> for Y and Y itself is in `git log` where it can show the "replaced"
> moniker.
>
> > > I'm not entirely sure this is a good direction, though.
> > >
> > >> and possibly worse, if I create a new branch based on it and use it:
> > >>
> > >>     $ git branch foobar deadbeefdeadbeefdeadbeefdeadbeefdeadbeef
> > >>     $ git checkout foobar
> > >>     $ echo stuff >empty
> > >>     $ git add empty
> > >>     $ git commit -m more
> > >>
> > >> then it's clear that branch created foobar pointing to the replaced
> > >> object rather than the replacement object -- despite the fact that the
> > >> replaced object doesn't even exist within this repo:
> > >>
> > >>     $ git cat-file -p HEAD
> > >>     tree 18108bae26dc91af2055bc66cc9fea278012dbd3
> > >>     parent deadbeefdeadbeefdeadbeefdeadbeefdeadbeef
> > >>     author Elijah Newren <newren@xxxxxxxxx> 1617083739 -0700
> > >>     committer Elijah Newren <newren@xxxxxxxxx> 1617083739 -0700
> > >>
> > >>     more
> > >
> > > Yeah, that's pretty horrible.
> >
> > I am not sure.  As you analyze below, the replace mechanism is about
> > telling Git: when anybody refers to deadbeef, use its replacement if
> > defined instead.
> >
> > And one of the points in the mechanism is to allow to do so even
> > retroactively, so the HEAD object there may be referring to deadbeef
> > that may not exist does not matter, as long as the object that is to
> > replace deadbeef is available.  If not, that is a repository
> > corruption.  After all, the commit object you cat-file'ed may have
> > been created by somebody else in a separate repository that had
> > deadbeef before they were told by Elijah that the object is obsolete
> > and to be replaced by something else (Git supports distributed
> > development) and then pulled into Elijah's repository, and we should
> > be prepared to seeing "parent deadbeef" in such a commit.  As long as
> > replacement happens when accessing the contents, that would be OK.
> >
> > So, I do not see it as "pretty horrible", but I may be missing
> > something.
>
> I think you're focusing on git commit, or perhaps on git checkout.
> I'm focusing on git branch; what it did does not seem fine to me.
> Using your own words:
>
> "the replace mechanism is about telling Git: when anybody refers to
> deadbeef, use its replacement if defined instead."
>
> git branch didn't do that; it put deadbeef into refs/heads/foobar.

Perhaps I should also add not only why this breaks expectations, but
why that broken expectation causes problems:

* People tend to have commit hashes stored in lots of weird places --
bug trackers, reports, emails, etc.  These tend to be important for a
short time period, but the number of these references makes it harder
for folks who want to rewrite history to fix various past issues (very
large binary blobs and other misdeeds).

* filter-repo uses replace refs to provide users with a way to access
new commits using old commit hashes, to help them through this
transition period.

* Additional refs (especially one for every commit) will cause some
slowness.  So it's nice to be able to provide these replace refs for
short term transition, but tell users they can simply delete the
replace refs when they no longer need them without consequence.
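
To make that lifecycle concrete, here is a rough sketch in a throwaway
repository.  The temporary directory, the identity settings, and the
use of `git commit --amend` as a stand-in for a real history rewrite
are all assumptions of mine for illustration; filter-repo's actual
rewrite is more involved, and after a real rewrite the old object
would typically not exist at all.

```shell
set -e
# Throwaway repository (hypothetical setup, not from the report).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"

# "old" plays the pre-rewrite commit, "new" the rewritten one.
git commit -q --allow-empty -m "old history"
old=$(git rev-parse HEAD)
git commit -q --amend --allow-empty -m "rewritten history"
new=$(git rev-parse HEAD)

# Map the old hash onto the new commit for the transition period.
git replace "$old" "$new"
with_replace=$(git cat-file -p "$old" | tail -n 1)

# Once nobody needs the old hash any more, drop the replace ref.
git replace -d "$old"
remaining=$(git replace -l | wc -l)
without_replace=$(git cat-file -p "$old" | tail -n 1)
```

While the replace ref exists, looking up the old hash yields the
rewritten commit; once it is deleted, nothing that was built on the
new hashes is affected.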


The fact that git branch puts deadbeef into refs/heads/foobar leads
to a chain where new commits now rely on replace refs.  In the
best case, others will not be able to pull from this user and the user
will not be able to push the new commits anywhere -- and that user
will have some work to do to rewrite (rebase?) the commits
appropriately.  In the worst case, the users do succeed in
distributing this new history, and now all users everywhere will be
required to keep all replace refs for all time (or at least until the
next major repository rewrite)...
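
For anyone wanting to see the chain, a minimal sketch follows.  It
assumes a throwaway repository, uses `git commit --amend` as a
stand-in for a rewrite, and uses `git update-ref` to put the old hash
into refs/heads/foobar directly, mimicking the result reported above
rather than relying on what any particular version of git branch
does.  Unlike the reported case, the old object still exists here.

```shell
set -e
# Throwaway repository (hypothetical setup, not from the report).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"

git commit -q --allow-empty -m "old history"
old=$(git rev-parse HEAD)
git commit -q --amend --allow-empty -m "rewritten history"
new=$(git rev-parse HEAD)
git update-ref "refs/replace/$old" "$new"

# Stand-in for the reported outcome of `git branch foobar <old-hash>`.
git update-ref refs/heads/foobar "$old"
git checkout -q foobar
git commit -q --allow-empty -m more

# The new commit records the replaced hash as its parent...
parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')

# ...so it only resolves to the rewritten history while this ref exists.
depends_on=$(git rev-parse -q --verify "refs/replace/$parent" || true)
```

Here `$parent` is the old hash, and `$depends_on` is non-empty only
because the replace ref is still around; that dependency is what
makes deleting the replace refs later no longer consequence-free.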


