Re: [PATCH 11/12] builtin/show-ref: add new mode to check for reference existence

Patrick Steinhardt <ps@xxxxxx> · Wed, 25 Oct 2023 13:50:57 +0200

On Tue, Oct 24, 2023 at 05:01:55PM -0400, Eric Sunshine wrote:
> On Tue, Oct 24, 2023 at 9:11 AM Patrick Steinhardt <ps@xxxxxx> wrote:
> > While we have multiple ways to show the value of a given reference, we
> > do not have any way to check whether a reference exists at all. While
> > commands like git-rev-parse(1) or git-show-ref(1) can be used to check
> > for reference existence in case the reference resolves to something
> > sane, neither of them can be used to check for existence in some other
> > scenarios where the reference does not resolve cleanly:
> >
> >     - References which have an invalid name cannot be resolved.
> >
> >     - References to nonexistent objects cannot be resolved.
> >
> >     - Dangling symrefs can be resolved via git-symbolic-ref(1), but this
> >       requires the caller to special case existence checks depending on
> >       whteher or not a reference is symbolic or direct.
> 
> s/whteher/whether/
> 
> > Furthermore, git-rev-list(1) and other commands do not let the caller
> > distinguish easily between an actually missing reference and a generic
> > error.
> >
> > Taken together, this gseems like sufficient motivation to introduce a
> 
> s/gseems/seems/
> 
> > separate plumbing command to explicitly check for the existence of a
> > reference without trying to resolve its contents.
> >
> > This new command comes in the form of `git show-ref --exists`. This
> > new mode will exit successfully when the reference exists, with a
> > specific error code of 2 when it does not exist, or with 1 when there
> > has been a generic error.
> >
> > Note that the only way to properly implement this command is by using
> > the internal `refs_read_raw_ref()` function. While the public function
> > `refs_resolve_ref_unsafe()` can be made to behave in the same way by
> > passing various flags, it does not provide any way to obtain the errno
> > with which the reference backend failed when reading the reference. As
> > such, it becomes impossible for us to distinguish generic errors from
> > the explicit case where the reference wasn't found.
> >
> > Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
> > ---
> > diff --git a/Documentation/git-show-ref.txt b/Documentation/git-show-ref.txt
> > @@ -65,6 +70,12 @@ OPTIONS
> > +--exists::
> > +
> > +       Check whether the given reference exists. Returns an error code of 0 if
> 
> We probably want to call this "exit code" rather than "error code"
> since the latter is unnecessarily scary sounding for the success case
> (when the ref does exist).

I was trying to stick to the preexisting style of "error code" in this
manual page. But I agree with your argument: we would be calling it an
error code even in the successful case, which is misleading.

> > +       it does, 2 if it is missing, and 128 in case looking up the reference
> > +       failed with an error other than the reference being missing.
> 
> The commit message says it returns 1 for a generic error, but this
> inconsistently says it returns 128 for that case. The actual
> implementation returns 1.

Good catch, fixed.

> > diff --git a/builtin/show-ref.c b/builtin/show-ref.c
> > @@ -214,6 +215,41 @@ static int cmd_show_ref__patterns(const struct patterns_options *opts,
> > +static int cmd_show_ref__exists(const char **refs)
> > +{
> > +       struct strbuf unused_referent = STRBUF_INIT;
> > +       struct object_id unused_oid;
> > +       unsigned int unused_type;
> > +       int failure_errno = 0;
> > +       const char *ref;
> > +       int ret = 1;
> > +
> > +       if (!refs || !*refs)
> > +               die("--exists requires a reference");
> > +       ref = *refs++;
> > +       if (*refs)
> > +               die("--exists requires exactly one reference");
> > +
> > +       if (refs_read_raw_ref(get_main_ref_store(the_repository), ref,
> > +                             &unused_oid, &unused_referent, &unused_type,
> > +                             &failure_errno)) {
> > +               if (failure_errno == ENOENT) {
> > +                       error(_("reference does not exist"));
> 
> The documentation doesn't mention this printing any output, and indeed
> one would intuitively expect a boolean-like operation to not produce
> any printed output since its exit code indicates the result (except,
> of course, in the case of a real error).

I'm inclined to leave this as-is. While the exit code should be
sufficient, in more interactive use cases it is rather easy to wonder
whether the command did anything at all and why it failed. Not that I
think these will necessarily exist.

I also don't think it's going to hurt to print this error. If it ever
does start to become a problem we might end up honoring the "--quiet"
flag to squelch this case.

> > +                       ret = 2;
> > +               } else {
> > +                       error(_("failed to look up reference: %s"), strerror(failure_errno));
> 
> Or use error_errno():
> 
>     errno = failure_errno;
>     error_errno(_("failed to look up reference"));

Ah, good suggestion.
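
As far as I understand error_errno(), it appends the strerror(errno)
text to the message by itself, so the caller's format string should not
need a trailing "%s" for it. A minimal standalone sketch of that
behavior, assuming nothing from Git itself: `format_with_errno()` is a
hypothetical stand-in written only for illustration, not the real
function.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for illustration only: builds
 * "error: <msg>: <strerror(err)>" in `buf`, which mirrors the shape of
 * message error_errno() is expected to print. Returns what snprintf()
 * returns.
 */
static int format_with_errno(char *buf, size_t len, const char *msg, int err)
{
	return snprintf(buf, len, "error: %s: %s", msg, strerror(err));
}
```

With that shape, passing "failed to look up reference" as the message
is enough; the errno text is added after the colon.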

> > +               }
> > +
> > +               goto out;
> > +       }
> > +
> > +       ret = 0;
> > +
> > +out:
> > +       strbuf_release(&unused_referent);
> > +       return ret;
> > +}
> 
> It's a bit odd having `ret` be 1 at the outset rather than 0, thus
> making the logic a bit more difficult to reason about. I would have
> expected it to be organized like this:
> 
>     int ret = 0;
>     if (refs_read_raw_ref(...)) {
>         if (failure_errno == ENOENT) {
>             ret = 2;
>         } else {
>             ret = 1;
>             errno = failure_errno;
>             error_errno(_("failed to look up reference"));
>         }
>     }
>     strbuf_release(...);
>     return ret;

Fair enough. I've seen both styles used in our codebase, but ultimately
don't care much which of the two we use here. Will adapt.
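
To spell out the mapping being discussed, here is a rough standalone
sketch of the "start at 0, set the code per failure" flow. It is not
the patch itself: `exists_exit_code()` and the enum names are made up
for illustration, and the two parameters stand in for the return value
of the reference lookup and the errno it reported.

```c
#include <errno.h>

/* Exit codes for `git show-ref --exists` as described in this thread. */
enum {
	EXISTS_FOUND = 0,   /* reference exists */
	EXISTS_ERROR = 1,   /* generic lookup error */
	EXISTS_MISSING = 2  /* reference does not exist */
};

/*
 * Hypothetical helper: `read_failed` is non-zero when the lookup
 * failed, `failure_errno` is the errno the backend reported.
 */
static int exists_exit_code(int read_failed, int failure_errno)
{
	int ret = EXISTS_FOUND;

	if (read_failed) {
		if (failure_errno == ENOENT)
			ret = EXISTS_MISSING;
		else
			ret = EXISTS_ERROR;
	}
	return ret;
}
```

Starting from 0 means every non-zero result is assigned in exactly one
place, which is what makes this variant easier to follow.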

> > @@ -272,13 +309,15 @@ int cmd_show_ref(int argc, const char **argv, const char *prefix)
> > +       if ((!!exclude_existing_opts.enabled + !!verify + !!exists) > 1)
> > +               die(_("only one of --exclude-existing, --exists or --verify can be given"));
> 
> When reviewing an earlier patch in this series, I forgot to mention
> that we can simplify the life of translators by using placeholders:
> 
>     die(_("options '%s', '%s' or '%s' cannot be used together"),
>         "--exclude-existing", "--exists", "--verify");
> 
> which ensures that they don't translate the literal option names, and
> makes it possible to reuse the translated message in multiple
> locations (since it doesn't mention hard-coded option names).

Done.
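
For illustration, the mutual-exclusion check plus the placeholder-based
message can be sketched standalone as below. This uses plain snprintf()
instead of die() and _() so that it compiles outside of Git;
`more_than_one()` and `format_conflict()` are hypothetical names, not
functions from the series.

```c
#include <stdio.h>

/*
 * `!!x` normalizes any non-zero value to 1, so the sum counts how many
 * of the three modes were requested.
 */
static int more_than_one(int exclude_existing, int verify, int exists)
{
	return (!!exclude_existing + !!verify + !!exists) > 1;
}

/*
 * The option names are passed as arguments, so the translatable
 * template contains no literal option names and can be reused for
 * other option triples.
 */
static int format_conflict(char *buf, size_t len,
			   const char *a, const char *b, const char *c)
{
	return snprintf(buf, len,
			"options '%s', '%s' or '%s' cannot be used together",
			a, b, c);
}
```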

Thanks for your review, highly appreciated! I'll wait until tomorrow for
additional feedback and then send out v2.

Patrick