Re: [PATCH v3 0/8] refs: add symref support to 'git-update-ref'

Jeff King <peff@xxxxxxxx> · Fri, 26 Apr 2024 16:41:45 -0400

On Wed, Apr 24, 2024 at 09:25:27AM -0700, Karthik Nayak wrote:

> This also brings light onto the previous versions we were considering:
> 
>     symref-update SP <ref> SP <new-target> [SP (<old-target> | <old-oid>)] LF
> 
> There is also some ambiguity here which we missed, especially when we
> support dangling refs. If we're updating a dangling ref <ref>, and we
> provide an old value. Then there is uncertainty around whether the
> provided value is actually a <old-target> or if it's an <old-oid>.
> 
> For non dangling ref symref, we first parse it as an <old-target> and
> since the <old-target> would exist, we can move on.
> 
> So I see two ways to go about this,
> 
> 1. In the symref-update function, we need to parse and see if <ref> is a
> regular ref or a symref, if it is symref, we simply set the provided old
> value as <old-target>, if not, we set it as <old-oid>. This seems clunky
> because we'll be parsing the ref and trying to understand its type in
> 'update-ref.c', before the actual update.

I think this avoids the "mischief" case I mentioned because it is about
looking at what is in <ref> currently, not spooky action from a
possibly-unrelated "refs/heads/ref". But in general, I think it is a
good thing if we can tell what the caller is asking for based only on
the syntax of the request, without taking into account repository state.
It just makes things easier to reason about.

> 2. We change the syntax to something like
> 
>     symref-update SP <ref> SP <new-ref> [SP (ref <old-target> | oid
> <old-oid>)] LF
> 
> this would remove any ambiguity since the user specifies the data type
> they're providing.

Yeah, I was going to suggest that it could be resolved with any syntax
that would not be a valid oid name. Certainly check-ref-format places
some restrictions there (e.g., no "^") but name resolution relies on
that, too (so foo^{tree} is a valid name). Probably something like
"^foo" is unambiguous, but it's ugly and hard to explain. ;)

But I think your "ref <old-target>" solves that. Resolved names can have
spaces in them, but only after a "^" (e.g., "foo^{/some regex}") or ":"
(e.g., "foo:path with spaces"). So seeing just "ref" by itself, followed
by a space, I think is unambiguous. And it looks pretty.

It gets a little tricky when the field delimiter is also space, and the
item in question is not the final field. See below.

> Also on a sidenote, it's worth considering that with the direction of
> [2], we could also extrapolate to introduce {verify, update, create,
> delete} v2, which support both symrefs and regular refs. But require
> explicit types from the user:
> 
>     update-v2 SP <ref> NUL (oid <new-oid> | ref <new-target>) NUL
> [(oid <old-oid> | ref <old-target>)] NUL
> 	create-v2 SP <ref> NUL (oid <new-oid> | ref <new-target>) NUL
> 	delete-v2 SP <ref> NUL [(oid <old-oid> | ref <old-target>)] NUL
> 	verify-v2 SP <ref> NUL [(oid <old-oid> | ref <old-target>)] NUL
> 
> This is similar to the v3 patches I've currently sent out, in that it
> would also allow cross operations between regular refs and symrefs.

So I _think_ that "<oid> | ref <symref-target>" is unambiguous. In which
case you could just have:

  update SP <ref> NUL (<new-oid> | ref <new-target>) NUL [(<old-oid> | ref <old-target>)]

which is backwards-compatible.

With the NUL separator it's easy to parse, because "ref " with a
trailing space always means a ref, even in the "new" field. But with
spaces instead, it gets weird. If you have:

  update refs/heads/foo refs refs/heads/bar

it can either be creating a symref to "bar" (with no "old" specifier) or
it could be pointing it at the resolved-name "refs", which the old value
coming from "bar".

I guess one option would be to only allow "ref" syntax in "-z" mode, but
that is probably getting to be a bit weird and hard to explain.

-Peff