Re: [PATCH v2 8/8] refs: explicitly propagate errno from refs_read_raw_ref

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Mon, 05 Jul 2021 21:08:50 +0200

On Mon, Jul 05 2021, Han-Wen Nienhuys wrote:

> On Thu, Jul 1, 2021 at 3:06 PM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
>> > -     const char *result = refs_resolve_ref_unsafe_implicit_errno(
>> > -             refs, refname, resolve_flags, oid, flags);
>> > -     *failure_errno = errno;
>> > -     return result;
>> > +     int ignore;
>> > +     return refs_resolve_ref_unsafe_with_errno(refs, refname, resolve_flags,
>> > +                                               oid, flags, &ignore);
>> >  }
>>
>> Okey, so here we have something like the "caller should explicitly
>> ignore it" I suggested in <878s2qdy8y.fsf@xxxxxxxxxxxxxxxxxxx>. We no
>> longer set "errno" at all for the benefit of the callers of that
>> function, but instead explicitly throw it away, our
>> refs_resolve_ref_unsafe_with_errno() no longer sets errno.
>>
>> But this end-state seems to have resulted in introducing new bugs. A
>> bunch of functions are thin wrappers for
>> refs_resolve_ref_unsafe(). Previously they could inspect errno on a -1
>> return, now they can't.
>>
>> I didn't look at them all, but just the second one I looked at,
>> refs_read_ref_full() has a verify_lock() caller which calls it, and that
>
> It looks like you were lucky. I looked at the output of

Maybe so...

> egrep -A7 -B7 --color -nH -e
> '(refs_resolve_refdup|resolve_refdup|refs_read_ref_full|read_ref_full|read_ref|refs_ref_exists|ref_exists|resolve_gitlink_ref|index_path)\('
> *.[ch] */*.[ch]

FWIW this gives better results:

    git grep -E -W '\b(refs_resolve_refdup|resolve_refdup|refs_read_ref_full|read_ref_full|read_ref|refs_ref_exists|ref_exists|resolve_gitlink_ref|index_path)\('

> and the case you pointed out is the only one which inspects errno
> after calling resolve_ref_unsafe (transitively through the grepped
> functions.)

Isn't the `errno == EISDIR` in files_copy_or_rename_ref() another one?
It's just after calling refs_read_ref_full().

Yes, but aside from that it does look like I got lucky. I assumed this
issue was more widespread. Phew!

>> function then expects a meaningful errno that it in turn ferries up with
>> "can't verify ref". It in turn is called by lock_ref_oid_basic(), it
>> ferries up that strbuf with the error, which'll now have a meaningless
>> strerror(0) in it.
>>
>> So it seems to me that the refactoring being done here is halfway done,
>> [..]
>> of what you're starting here. I.e. we have a low-level function that may
>> encounter an errno, and then callers removed from that via:
>>
>>     index_path() -> resolve_gitlink_ref() -> refs_resolve_ref_unsafe()
>>..
>> Where the common case for index_path() is to report the error we got,
>> but it doesn't call die_errno() or error_errno() now, but really should,
>
> "but really should": Why?
>
> Why should all functions that resolve a ref print out system level
> errors? For example, that would entail large changes to sequencer.c
> which does "error" == "does not exist".
>
> Also, the errors you print out will be fundamentally limited. An error
> in resolving HEAD could be caused by a problem in any of the following
> files
>
>    .git/HEAD
>    .git/packed-refs
>    .git/refs/
>    .git/refs/heads/
>    .git/refs/heads/main
>
> Since we have no way of reporting the filename involved, what is a
> user going to do with the error message ("file doesn't exist")? I
> think refs_verify_refname_available() provides a more realistic
> approach: create custom error messages for situations that a user is
> more likely to encounter.

Let me rephrase: "we really shouldn't", as in I agree that this API
sucks and that refs_verify_refname_available() is a much better
approach.

But as long as errno was the abstraction we were passing around it
seemed like e.g. callers like process_directory() should be getting the
errno and using the *_errno() error functions, but of course it would be
much better if they used a better error return flow.

But having looked at the grep above with fresh eyes I think it's fine to
leave it at the refs_resolve_ref_unsafe_with_errno() boundary. The error
messages are clear enough, and we're trying to phase out errno anyway.

I was more paranoid about things that would actually do equality checks
against the errno, as it appearsfiles_copy_or_rename_ref() is still
doing.

> Also, for something like the reftable backend, there are error classes
> that have no direct errno counterpart. For example, if reftable finds
> corrupt zlib data, should it invent a bogus errno code so it can work
> with the errno reporting system?

No, that would be crazy. We should be moving away from errno
entirely. My concern is only that we might be introducing subtle bugs
further up the callstack now that we clear "errno" ...

> I'll take from this review that I should elide the part where errno is
> cleared, and leave it to someone else to figure out a more holistic
> strategy to error reporting.

...or if we don't clear errno introducing those sorts of subtle bugs
when we use reftable instead of the files backend. I.e. no, we really
should be clearing errno, if not in this series then in some other
pre-reftable series.

To not do so would be kicking this particular can down the road, and
leaving those bugs for reftable users to find. Which given that I've
found a few cases with no test coverage...