Re: [PATCH] rebase --autosquash: fix a potential segfault

Johannes Schindelin <Johannes.Schindelin@xxxxxx> · Sat, 9 May 2020 01:45:03 +0200 (CEST)

Hi Peff,

On Thu, 7 May 2020, Jeff King wrote:

> On Wed, May 06, 2020 at 11:35:48PM +0200, Johannes Schindelin wrote:
>
> > > >> > +				next[i] = next[i2];
> > > >> > +				next[i2] = i;
> > > >> > +				continue;
> > > >> > +			}
> > > >>
> > > >> I do have one question, though. What happens if we add a second
> > > >> fixup-of-a-fixup?
> > > >
> > > > Thanks for asking this question, I was a little curious about it, too.
> > >
> > > Interesting that three people looked at the same patch and asked the
> > > same question in different ways ;-)
> >
> > Indeed!
> >
> > I am very grateful, as I had missed that, and it helped me figure out a
> > better way to do it, and v2 looks a lot nicer, too.
>
> OK, so your v2 addresses that. Does that mean it was broken in v1?

Yes.

> If so, then why didn't my test reveal it?

Let's dissect this:

 i  hash oneline
#0  1234 foo
#1  5678 fixup! foo
#2  abcd fixup! 5678
#3  dbaf fixup! 5678

Let's follow the original code, i.e. before my v1:

When #1 is processed, i.e. when `i == 1`, it finds `i2 == 0` as target. So
it sets `next[0]` as well as `tail[0]` to 1.

Then #2 is processed, i.e. `i == 2`, and it finds `i2 == 1` as target. It
sets `next[1]` as well as `tail[1]` to 2.

Now #3 is processed, i.e. `i == 3`, and it also finds `i2 == 1` as target.
It looks at `next[1]`, sees that it is already non-negative, and so sets
`next[tail[1]]`, i.e. `next[2]`, to 3. It also sets `tail[1]` to 3, but
nobody cares about that because there is no further todo command.
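
To make the walkthrough above easy to replay, here is a minimal sketch of
the pre-v1 linking as I described it. It is not a copy of the actual code:
`link_old()`, `run_walkthrough()` and the `target` array (the `i2` found
for each item, or -1) are names made up for this illustration.

```c
#include <assert.h>

#define N 4

/* Link item i behind target i2 the way the pre-v1 logic is described above. */
static void link_old(int *next, int *tail, int i, int i2)
{
	assert(i2 >= 0 && i2 < i); /* a target always precedes its fixup */
	if (next[i2] < 0)
		next[i2] = i;
	else
		next[tail[i2]] = i;
	tail[i2] = i;
}

/* Replay items #0..#3; next[] and tail[] must each hold N ints. */
static void run_walkthrough(int *next, int *tail)
{
	/* Hypothetical stand-in for the target lookup: i2 per item, or -1. */
	static const int target[N] = { -1, 0, 1, 1 };
	int i;

	for (i = 0; i < N; i++)
		next[i] = tail[i] = -1;
	for (i = 0; i < N; i++)
		if (target[i] >= 0)
			link_old(next, tail, i, target[i]);
}
```

Calling `run_walkthrough(next, tail)` on two `int[4]` arrays leaves `next`
as { 1, 2, 3, -1 } and `tail` as { 1, 3, -1, -1 }, matching the steps above.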

Now, let's follow the code with my v1:

It actually does the same as before! Why, you ask? Because at no stage is
there any non-negative `next[j]` whose corresponding `tail[j]` is
negative. (The one exception comes after #3 was processed: at that stage,
`next[2]` is non-negative but `tail[2]` is still negative. As I said,
no one cares, because there are no remaining todo commands.)

So the crucial part to trigger this bug is to have a regular `fixup!
<oneline>` _between_ the `fixup! <oneline>` and the `fixup! <hash>`
targeting the latter. So I think I can modify your example accordingly:

	1234 foo
	5678 fixup! foo
	90ab fixup! foo
	abcd fixup! 5678
	dbaf fixup! 5678

Or using your actual shell commands:

  git commit -m base --allow-empty
  git commit --squash HEAD -m 'this is the first squash' --allow-empty
  s=$(git rev-parse HEAD)
  git commit --fixup HEAD^ --allow-empty # This is the crucial command
  git commit -m "squash! $s" -m 'this is the second squash' --allow-empty
  git commit -m "squash! $s" -m 'this is the third squash' --allow-empty
  git rebase -ki --autosquash --root

Note the crucial command `git commit --fixup HEAD^`. When processing that,
`i == 2` and `i2 == 0` (as for `i == 1`), and before v1, this would have
set `next[1]` but `tail[0]`! With v1, this would have led to #4 and #5
being exchanged. With v2, the role of `tail` would have been extended to
not only talk about the beginning of a fixup/squash chain, but about _any_
target of a fixup/squash, even if it is in the middle of a chain.
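
For what it's worth, the problematic state can be found without actually
performing the bad write. The following is only a sketch under the same
assumptions as my description of the pre-v1 logic; `first_bad_item()` and
the `target` array (the `i2` for each item, or -1) are hypothetical names,
and the indexes are zero-based.

```c
#include <assert.h>

#define N 5

/*
 * Replay the pre-v1 linking and return the first item whose linking
 * would index next[] with a negative tail[i2], or -1 if none does.
 */
static int first_bad_item(const int *target, int n)
{
	int next[N], tail[N], i;

	assert(n <= N);
	for (i = 0; i < n; i++)
		next[i] = tail[i] = -1;

	for (i = 0; i < n; i++) {
		int i2 = target[i];

		if (i2 < 0)
			continue; /* not a fixup/squash */
		if (next[i2] < 0) {
			next[i2] = i;
		} else {
			if (tail[i2] < 0)
				return i; /* would have written next[-1] */
			next[tail[i2]] = i;
		}
		tail[i2] = i;
	}
	return -1;
}
```

With the targets { -1, 0, 0, 1, 1 } of the modified example, this returns 3:
`next[1]` was set by the second `fixup! foo`, but only `tail[0]` was
updated. With { -1, 0, 1, 1 }, i.e. the earlier example, it returns -1.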

So why does this work? Why does it still do the right thing _even after_
inserting a fixup in the middle of a chain?

That's the beauty: if I insert anything in the middle of it, the `tail` of
the actual beginning of the fixup/squash chain won't need to be changed.
It still points to the end of that chain.

All I need to ensure is that item `i` is not just appended to the "chain"
starting at `i2`, but that it is _inserted_ at the end of that chain in
case that it is actually part of a larger chain, i.e. that its `next[i]`
is set correctly before making it the immediate successor of the target
commit. Since all of the elements in `next` and `tail` are initialized to
`-1` (i.e. "no next fixup/squash item after this"), it will even do the
right thing when it should actually append: it will set `next[i]` to `-1`
in that case.
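
Here is a small sketch of that insertion, following the explanation above
rather than quoting the patch. `link_insert()`, `rearrange()` and the
`target` array are made-up names, and the way the final order is emitted
is my simplification for this illustration, not the sequencer's loop.

```c
#include <assert.h>

#define N 5

/* Insert item i at the end of the chain hanging off target i2. */
static void link_insert(int *next, int *tail, int i, int i2)
{
	/* Last item already attached to i2, or i2 itself if there is none. */
	int last = tail[i2] < 0 ? i2 : tail[i2];

	assert(i != i2);
	next[i] = next[last]; /* stays -1 when this is a plain append */
	next[last] = i;
	tail[i2] = i;
}

/* Fill order[] with the rearranged item indexes; returns the count. */
static int rearrange(const int *target, int n, int *order)
{
	int next[N], tail[N], i, cur, count = 0;

	assert(n <= N);
	for (i = 0; i < n; i++)
		next[i] = tail[i] = -1;
	for (i = 0; i < n; i++)
		if (target[i] >= 0)
			link_insert(next, tail, i, target[i]);

	/* Emit every non-fixup item followed by its whole chain. */
	for (i = 0; i < n; i++) {
		if (target[i] >= 0)
			continue;
		for (cur = i; cur >= 0; cur = next[cur])
			order[count++] = cur;
	}
	return count;
}
```

For the targets { -1, 0, 0, 1, 1 }, the resulting order is 0, 1, 3, 4, 2:
both `fixup! 5678` items land directly after 5678, and the second
`fixup! foo` still ends the chain, so `tail[0]` did not need to change.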

> I'm not really doubting that your v2 works so much as trying to
> un-confuse myself about the whole situation (which in turn might lead to
> a more intelligent review).

I wish I were quicker in my responses, because I think that this is a
really helpful conversation. By "forcing my hand" on a thorough explanation,
you really help me get clarity for myself about the actual underlying
issues. So even if I still think that v2 is correct after writing up the
above explanation, the degree of my confidence increased substantially.

Thanks,
Dscho