Re: [PATCH v2 1/4] wt-status: read HEAD and ORIG_HEAD via the refdb

Patrick Steinhardt <ps@xxxxxx> · Thu, 14 Dec 2023 14:21:35 +0100

On Tue, Dec 12, 2023 at 12:24:24PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@xxxxxx> writes:
[snip]
> > diff --git a/wt-status.c b/wt-status.c
> > index 9f45bf6949..fe9e590b80 100644
> > --- a/wt-status.c
> > +++ b/wt-status.c
> > @@ -1295,26 +1295,27 @@ static char *read_line_from_git_path(const char *filename)
> >  static int split_commit_in_progress(struct wt_status *s)
> >  {
> >  	int split_in_progress = 0;
> > -	char *head, *orig_head, *rebase_amend, *rebase_orig_head;
> > +	struct object_id head_oid, orig_head_oid;
> > +	char *rebase_amend, *rebase_orig_head;
> >  
> >  	if ((!s->amend && !s->nowarn && !s->workdir_dirty) ||
> >  	    !s->branch || strcmp(s->branch, "HEAD"))
> >  		return 0;
> >  
> > -	head = read_line_from_git_path("HEAD");
> > -	orig_head = read_line_from_git_path("ORIG_HEAD");
> > +	if (read_ref_full("HEAD", RESOLVE_REF_NO_RECURSE, &head_oid, NULL) ||
> > +	    read_ref_full("ORIG_HEAD", RESOLVE_REF_NO_RECURSE, &orig_head_oid, NULL))
> > +		return 0;
> > +
> 
> This made me wonder if we have changed behaviour when on an unborn
> branch.  In such a case, the original most likely would have read
> "ref: blah" in "head" and compared it with "rebase_amend", which
> would be a good way to ensure they would not match.  I would not
> know offhand what the updated code would do, but head_oid would be
> uninitialized in such a case, so ...?

The code flow goes as following:

  1. We call `read_ref_full()`, which itself calls
     `refs_resolve_ref_unsafe()`.

  2. It calls `refs_read_raw_ref()`, which succeeds and finds the
     symref.

  3. We notice that this is a symref and that `RESOLVE_REF_NO_RECURSE`
     is set. We thus clear the object ID and return the name of the
     symref target.

  4. Back in `read_ref_full()` we see that `refs_resolve_ref_unsafe()`
     returns the symref target, which we interpret as successful lookup.
     We thus return `0`.

  5. Now we look up "rebase-merge/{amend,orig-head}" and end up
     comparing whatever they contain with the cleared OID resolved from
     HEAD/ORIG_HEAD.

So the OID would not be uninitialized but the zero OID. Now:

  - "rebase-merge/amend" always contains the result of `repo_get_oid()`
    and never contains the zero OID.

  - "rebase-merge/orig-head" may contain the zero OID when there was no
    ORIG_HEAD at the time of starting a rebase or in case it did not
    resolve

So... if ORIG_HEAD was rewritten to be a symref pointing into nirvana
between starting the rebase and calling into "wt-status.c", and when
ORIG_HEAD didn't exist at the time of starting the rebase, then we might
now wrongly report that splitting was in progress.

In other cases the old code was actually doing the wrong thing. Suppose
that ORIG_HEAD was a dangling symref, then we'd have written the zero
OID into "rebase-merge/orig-head". But when calling into "wt-status.c"
now we read the still-dangling symref value and notice that the zero OID
is different than "ref: refs/heads/dangling".

I dunno. It feels like this is one of many cases where as you start to
think deeply about how things behave you realize that it's been broken
all along. On the other hand, I doubt there was even a single user who
ever experienced this issue. It often just needs to be correct enough.

I think the best way to go about this is to check for `REF_ISSSYMREF`
and exit early in that case. We only want to compare direct refs, so
this is closer to the old behaviour and should even fix edge cases like
the above.

Will update.

Patrick
Attachment:
signature.asc

Description: PGP signature