Re: Should --update-refs exclude refs pointing to the current HEAD?

Stefan Haller <lists@xxxxxxxxxxxxxxxx> · Tue, 12 Mar 2024 10:28:48 +0100

On 09.03.24 04:28, Elijah Newren wrote:
>> That would be the wrong way round. I want to leave the original branch
>> untouched, make a new branch and rebase that away from the original.
> 
> Ah, sorry for misunderstanding.  Still, though, what's wrong with running
>     git branch -f original_branch original_branch@{1}
> after the operation?

It's unintuitive. Users don't think this way, at least as far as I have
observed them (and I don't think this way myself). Also, for many users
the branch@{n} syntax to access previous reflog entries is an advanced
concept that they are not familiar with.

> Also, since you're not using the git cli directly but going through
> lazygit, isn't this something you can just include in lazygit as part
> of whatever overall operation is creating the new copy branch and
> rebasing it?

Yes, there are various workarounds that I could build into lazygit.
Right now I'm planning to have lazygit check whether any branch heads
point at any of the commits in the range of commits that is being
rebased except for the head, and if not, add --no-update-refs. This will
solve it well enough for most cases, and it doesn't bother me too much
that I have to add this additional complexity to our code. I was just
hoping that cli users typing

  git checkout -b copy original-branch
  git rebase --onto devel main

would get the same improvement. It bothers me a bit that we have to
build clients around the git cli that make it perform better than the
git cli does.
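
For illustration, the check described above could look roughly like the
following. This is only a sketch and not actual lazygit code: the
function name is made up, and it assumes the range being rebased is
<upstream>..HEAD and that only local branches matter.

```shell
# Hypothetical helper (illustration only): succeeds if any local branch
# head points at a commit in <upstream>..HEAD other than HEAD itself.
range_has_inner_branch_heads() {
    upstream=$1
    head_sha=$(git rev-parse HEAD)
    for sha in $(git rev-list "$upstream"..HEAD); do
        # The tip of the range is deliberately excluded from the check.
        [ "$sha" = "$head_sha" ] && continue
        if [ -n "$(git for-each-ref --points-at="$sha" refs/heads)" ]; then
            return 0
        fi
    done
    return 1
}

# Possible use: skip ref updates when nothing in the middle needs them.
# if ! range_has_inner_branch_heads main; then
#     git rebase --no-update-refs --onto devel main
# fi
```

A copy that sits exactly at the tip does not count here, so the rebase
would leave it alone; a branch anywhere below the tip keeps the default
behavior.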

>> Wait, now you are really turning things around. You make it sound like
>> my proposal is responsible for what you call a "bug" here. It's not, git
>> already behaves like this (and you may or may not consider that a
>> problem), and my proposal doesn't change anything about it. It doesn't
>> "fix" it, that's right (and this is what I referred to when I said "I'm
>> fine with it"), but it doesn't make it any worse either.
> 
> Ah, I see where I was unclear as well, and my lack of clarity stemmed
> from not understanding your proposal.  To try to close the loop, allow
> me to re-translate your "This is a good point, but..it never happens
> in practice for me." paragraph, the way I _erroneously_ read it at the
> time:
> 
> """
> For my new proposal, the case you bring up is a good point.  But it
> doesn't happen for me, so I propose to leave it as undefined behavior.
> [As undefined behavior, anyone that triggers it is likely to get
> behavior they deem buggy and not like it, but that won't affect me.]
> """
> 
> Now, obviously, that doesn't sound quite right.  I knew it at the
> time, but reading and re-reading your paragraph, it kept coming out
> that way for me.  Thus I tried to ask if that's what you really meant,
> and apologizing in advance if I was mis-reading.
> 
> Anyway, with the extra explanation in your latest email, I now see
> that you weren't leaving it undefined, but your proposal wasn't clear
> to me either in that paragraph or in combination with the rest of your
> previous email.  Sorry for my misunderstanding.

I think it's worth clarifying this again, and seeing whether "undefined
behavior" is the right term to use here. Again, this discussion has
improved my own understanding of the matter, so let me try to spell it
out again:

The fundamental underlying problem is that when we encounter two
branches pointing at the same commit in a rebase, git has no way to
distinguish whether this is because there's an "empty" branch in a stack
(either at the top or in the middle), or whether one branch is a copy of
the other. In the first case, both branches should be updated by "rebase
--update-refs", in the second case only one of them should, since the
other is not part of the stack. Since there's no way for git to tell for
sure, it can only guess which of the two was meant by the user, with a
heuristic that hopefully guesses right in the majority of cases. I think
it would be wrong to call it a "bug" (or an "edge case bug" like you did
earlier) if it guesses wrong in a particular scenario.

Right now, it _always_ guesses in favor of the stack, so it never
considers a branch to be a copy. For my own use of git, and that of my
co-workers as I have observed them in pairing sessions, this is almost
always wrong. I have never encountered an empty branch in a stack, as
far as I remember, but I am encountering copies of branches fairly
often, so I'd like to improve the heuristic to make git guess right in
these cases. Note that this is definitely not a 5% thing as in your
three-way merging example; I can't provide any hard numbers of course,
but it feels much more like the classical 80/20 rule to me (where my
proposal would improve it for 80% of the cases, to be clear).

I concluded that copies are much more frequent than empty branches in a
stack, so it would make sense to turn the heuristic around and always
guess in favor of a copied branch. The problem is that we can
only do this for the tip of the branch, because only in that case can we
tell which branch is the copy (the one being rebased) and which one is
the original that should be left alone. For branches in the middle of
the stack we just can't tell, so we have to guess in favor of an empty
branch in a stack and update both refs, since otherwise we'd have to
randomly pick one of them to update and leave the other one alone,
at the risk of breaking the stack.

So that's really where my proposal comes from: guess in favor of a
copied branch only at the tip but not in the middle; not because we only
want it at the tip, but just because we only can at the tip.
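
To make the proposed rule concrete, here is a rough sketch that lists
which local branches would be moved under it. It is not how git
implements --update-refs; the function name is hypothetical, and it
assumes HEAD is on a branch and the range is <upstream>..HEAD.

```shell
# Hypothetical sketch of the proposed heuristic (not git's implementation):
# print the local branches that would be updated for <upstream>..HEAD.
branches_to_update() {
    upstream=$1
    head_sha=$(git rev-parse HEAD)
    current=$(git symbolic-ref --short HEAD)
    for sha in $(git rev-list "$upstream"..HEAD); do
        for branch in $(git for-each-ref --format='%(refname:short)' \
                            --points-at="$sha" refs/heads); do
            # At the tip, any branch other than the one being rebased is
            # assumed to be a copy and is left alone.
            if [ "$sha" = "$head_sha" ] && [ "$branch" != "$current" ]; then
                continue
            fi
            # In the middle we cannot tell copies from empty branches,
            # so every branch is updated.
            echo "$branch"
        done
    done
}
```

With a branch "copy" at the tip and two branches at the same commit in
the middle, this prints the current branch and both middle branches,
but not "copy".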

But fortunately, it is in fact true that I almost never create a copy of
a branch in the middle of a stack, but then I almost never have empty
branches in the middle of a stack either, so it doesn't really matter to
me which way the heuristic guesses in this case.

I hope this clarifies it a bit more.

Having written all this, I do realize that it's probably too complex to
explain to users (not the behavior itself, which is fairly simple, but
the rationale behind it).

-Stefan