On Wed, Nov 29, 2023 at 09:14:20AM +0100, Patrick Steinhardt wrote:
> We have some references that are more special than others. The reason
> for them being special is that they either do not follow the usual
> format of references, or that they are written to the filesystem
> directly by the respective owning subsystem and thus circumvent the
> reference backend.
>
> This works perfectly fine right now because the reffiles backend will
> know how to read those refs just fine. But with the prospect of gaining
> a new reference backend implementation we need to be a lot more careful
> here:
>
>   - We need to make sure that we are consistent about how those refs are
>     written. They must either always be written via the filesystem, or
>     they must always be written via the reference backend. Any mixture
>     will lead to inconsistent state.
>
>   - We need to make sure that such special refs are always handled
>     specially when reading them.
>
> We're already mostly good with regard to the first item, except for
> `BISECT_EXPECTED_REV` which will be addressed in a subsequent commit.
> But the current list of special refs is missing a lot of refs that
> really should be treated specially. Right now, we only treat
> `FETCH_HEAD` and `MERGE_HEAD` specially here.
>
> Introduce a new function `is_special_ref()` that contains all current
> instances of special refs to fix the reading path.
>
> Based-on-patch-by: Han-Wen Nienhuys <hanwenn@xxxxxxxxx>
> Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
> ---
>  refs.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 56 insertions(+), 2 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 7d4a057f36..2d39d3fe80 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1822,15 +1822,69 @@ static int refs_read_special_head(struct ref_store *ref_store,
>  	return result;
>  }
>
> +static int is_special_ref(const char *refname)
> +{
> +	/*
> +	 * Special references get written and read directly via the filesystem
> +	 * by the subsystems that create them. Thus, they must not go through
> +	 * the reference backend but must instead be read directly. It is
> +	 * arguable whether this behaviour is sensible, or whether it's simply
> +	 * a leaky abstraction enabled by us only having a single reference
> +	 * backend implementation. But at least for a subset of references it
> +	 * indeed does make sense to treat them specially:
> +	 *
> +	 *   - FETCH_HEAD may contain multiple object IDs, and each one of them
> +	 *     carries additional metadata like where it came from.
> +	 *
> +	 *   - MERGE_HEAD may contain multiple object IDs when merging multiple
> +	 *     heads.
> +	 *
> +	 *   - "rebase-apply/" and "rebase-merge/" contain all of the state for
> +	 *     rebases, where keeping it closely together feels sensible.
> +	 *
> +	 * There are some exceptions that you might expect to see on this list
> +	 * but which are handled exclusively via the reference backend:
> +	 *
> +	 *   - CHERRY_PICK_HEAD
> +	 *   - HEAD
> +	 *   - ORIG_HEAD
> +	 *
> +	 * Writing or deleting references must consistently go either through
> +	 * the filesystem (special refs) or through the reference backend
> +	 * (normal ones).
> +	 */
> +	const char * const special_refs[] = {
> +		"AUTO_MERGE",
> +		"BISECT_EXPECTED_REV",
> +		"FETCH_HEAD",
> +		"MERGE_AUTOSTASH",
> +		"MERGE_HEAD",
> +	};

Is there a reason that we don't want to declare this statically?
If we did, I think we could drop one const, since the strings would
instead reside in the .rodata section.

> +	int i;

Not that it matters for this case, but it may be worth declaring i to be
an unsigned type, since it's used as an index into an array. size_t
seems like an appropriate choice there.

> +	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
> +		if (!strcmp(refname, special_refs[i]))
> +			return 1;
> +
> +	/*
> +	 * git-rebase(1) stores its state in `rebase-apply/` or
> +	 * `rebase-merge/`, including various reference-like bits.
> +	 */
> +	if (starts_with(refname, "rebase-apply/") ||
> +	    starts_with(refname, "rebase-merge/"))

Do we care about case sensitivity here? Definitely not on case-sensitive
filesystems, but I'm not sure about case-insensitive ones. For instance,
on macOS, I can do:

    $ git rev-parse hEAd

and get the same value as "git rev-parse HEAD" (on my Linux workstation,
this fails as expected).

I doubt that there are many users in the wild asking to resolve
reBASe-APPLY/xyz, but I think that after this patch that would no longer
work as-is, so we may want to replace this with istarts_with() instead.

Thanks,
Taylor
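
For illustration, the three suggestions above (a static table, a size_t
index, and a case-insensitive prefix check) could be combined roughly as
follows. This is a standalone sketch, not the patch as posted. The names
is_special_ref_sketch() and has_prefix_icase() are hypothetical;
has_prefix_icase() only stands in for Git's istarts_with() so that the
example compiles outside of the Git tree, and ARRAY_SIZE() is redefined
locally for the same reason. The sketch keeps both const qualifiers and
leaves the exact-name comparison case-sensitive, since only the prefix
check was questioned.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strncasecmp() (POSIX) */

/* Local stand-in for Git's ARRAY_SIZE() so the sketch compiles on its own. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/*
 * Hypothetical stand-in for Git's istarts_with(): returns 1 if `str`
 * begins with `prefix`, ignoring ASCII case, and 0 otherwise.
 */
static int has_prefix_icase(const char *str, const char *prefix)
{
	return !strncasecmp(str, prefix, strlen(prefix));
}

static int is_special_ref_sketch(const char *refname)
{
	/* static: the table has static storage and is not rebuilt per call. */
	static const char * const special_refs[] = {
		"AUTO_MERGE",
		"BISECT_EXPECTED_REV",
		"FETCH_HEAD",
		"MERGE_AUTOSTASH",
		"MERGE_HEAD",
	};
	size_t i; /* unsigned index, matches the type of ARRAY_SIZE() */

	/* Exact, case-sensitive match against the table of special refs. */
	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
		if (!strcmp(refname, special_refs[i]))
			return 1;

	/* Case-insensitive match on the rebase state directories. */
	return has_prefix_icase(refname, "rebase-apply/") ||
	       has_prefix_icase(refname, "rebase-merge/");
}
```

With this variant, "reBASe-APPLY/xyz" is classified as special, while
"fetch_head" is not, because the table lookup still uses strcmp().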