Re: [PATCH 1/8] refs: remove EINVAL specification from the errno sideband in read_raw_ref_fn

Han-Wen Nienhuys <hanwen@xxxxxxxxxx> · Wed, 19 May 2021 14:25:37 +0200

On Fri, Apr 30, 2021 at 4:38 AM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>
> > From: Han-Wen Nienhuys <hanwen@xxxxxxxxxx>
> >
> > A grep for EINVAL */*c reveals that no code inspects EINVAL after reading
> > references.
>
> We often use a pattern (which is common) like this:
>
>         if (some_func_in_ref_API(...) < 0) {
>                 if (errno == ENOENT || errno == EISDIR)
>                         ... it is OK for the file to be missing ...
>                 else
>                         ... error ...
>         }
>
> If a piece of code currently sets EINVAL to errno manually when
> signalling a failure by returning a negative value to communicate
> with such a caller, we wouldn't see EINVAL mentioned, so such a grep
> alone would not help us guarantee the correctness of an update to
> stop assignment of EINVAL at all.  The callers must be vetted more
> carefully than "we are happy that nobody explicitly mentions EINVAL".

Sure. I looked at the callers, and documented further what I looked
for. But how far should one go? It's a global variable, so
transitively, almost all of the code could be observing the EINVAL
value under very specific circumstances. But it would also be a
terrible, fragile coding style and use of undocumented behavior.

> > The files ref backend does use EINVAL so parse_loose_ref_contents() can
> > communicate to lock_raw_ref() about garbage following the hex SHA1, or a short
> > read in files_read_raw_ref(), but the files backend does not call into
> > refs_read_raw_ref(), so its EINVAL sideband error is unused.
>
> This paragraph is confusing.  It says EINVAL is used to signal
> lock_raw_ref(), and it says EINVAL is not used by the same files
> backend.  Which is correct?  If one part of the backend uses it, and
> other parts don't, wouldn't the backend as a whole still use it?

I tried to clarify the message. files-backend.c makes assumptions
about the errno return for files_read_raw_ref, but it's not making
assumptions about the abstracted API in refs.h

> That is because there is a codeflow like this:
>
>         if (files_read_raw_ref(...)) {
>                 if (errno == ENOENT) {
>                         ... do various things ...
>                 } else if (errno == EISDIR) {
>                         ... do different and various things ...
>                 } else if (errno == EINVAL && (*type & REF_ISBROKEN)) {
>                         ... deal with broken ref ...
>                 }
>                  ...
>         }

as mentioned above, this isn't calling refs_read_raw_ref, so it's not
affected by this patch.

> > + * Return 0 on success. If the ref doesn't exist, set errno to ENOENT and return
> > + * -1. If the ref exists but is neither a symbolic ref nor an object ID, it is
> > + * broken; set REF_ISBROKEN in type, and return -1. If there is another error
> > + * reading the ref, set errno appropriately and return -1.
>
> So, this is not sufficient to let caller correctly and safely handle
> errors.  "set REF_ISBROKEN in type, set errno to something other
> than ENOENT or EISDIR, and then return -1" is necessary, I would
> think.

I tweaked the comment. Note that the function has only a handful of
callers (and only one caller where this behavior is relevant), and
it's changed in a follow-on patch in this series. Is it worth the
effort to wordsmith this further?

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado