Re: [PATCH v2 2/3] strbuf: set errno to 0 after strbuf_getcwd

Kyle Lippincott <spectral@xxxxxxxxxx> · Tue, 6 Aug 2024 00:04:30 -0700

On Mon, Aug 5, 2024 at 11:26 PM Patrick Steinhardt <ps@xxxxxx> wrote:
>
> On Mon, Aug 05, 2024 at 08:51:50AM -0700, Junio C Hamano wrote:
> > Eric Sunshine <sunshine@xxxxxxxxxxxxxx> writes:
> >
> > > On Fri, Aug 2, 2024 at 5:32 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> > >> > [...]
> > >> > Set `errno = 0;` prior to exiting from `strbuf_getcwd` successfully.
> > >> > This matches the behavior in functions like `run_transaction_hook`
> > >> > (refs.c:2176) and `read_ref_internal` (refs/files-backend.c:564).
> > >>
> > >> I am still uneasy to see this unconditional clearing, which looks
> > >> more like spreading the bad practice from two places you identified
> > >> than following good behaviour modelled after these two places.
> > >>
> > >> But I'll let it pass.
> > >>
> > >> As long as our programmers understand that across strbuf_getcwd(),
> > >> errno will *not* be preserved, even if the function returns success,
> > >> it would be OK.  As the usual convention around errno is that a
> > >> successful call would leave errno intact, not clear it to 0, it
> > >> would make it a bit harder to learn our API for newcomers, though.
> > >
> > > For what it's worth, I share your misgivings about this change and
> > > consider the suggestion[*] to make it save/restore `errno` upon
> > > success more sensible. It would also be a welcome change to see the
> > > function documentation in strbuf.h updated to mention that it follows
> > > the usual convention of leaving `errno` untouched upon success and
> > > clobbered upon error.
> > >
> > > [*]: https://lore.kernel.org/git/xmqqv80jeza5.fsf@gitster.g/
> >
> > Yup, of course save/restore would be safer, and probably easier to
> > reason about for many people.
>
> Is it really all that reasonable? We're essentially partitioning our set
> of APIs into two sets, where one set knows to keep `errno` intact
> whereas another set doesn't. In such a world, you have to be very
> careful about which APIs you are calling in a function that wants to
> keep `errno` intact, which to me sounds like a maintenance headache.
>
> I'd claim that most callers never care about `errno` at all. For the
> callers that do, I feel it is way more fragile to rely on whether or not
> a called function leaves `errno` intact or not. For one, it's fragile
> because that may easily change due to a bug. Second, it is fragile
> because the dependency on `errno` is not explicitly documented via code,
> but rather an implicit dependency.
>
> So isn't it more reasonable to rather make the few callers that do
> require `errno` to be left intact to save it? It makes the dependency
> explicit, avoids splitting our functions into two sets and allows us to
> just ignore this issue for the majority of functions that couldn't care
> less about `errno`.

100% agreed. The C language specification says you can't rely on errno
persisting across function calls, and that the caller must preserve it
if it needs that behavior for some reason. The POSIX specification
says you can't either except in very rare circumstances where it
guarantees errno will not change. The Linux man page for errno says
you can't rely on errno not changing, even for printf:
https://man7.org/linux/man-pages/man3/errno.3.html

       A common mistake is to do

           if (somecall() == -1) {
               printf("somecall() failed\n");
               if (errno == ...) { ... }
           }

       where errno no longer needs to have the value it had upon return
       from somecall() (i.e., it may have been changed by the
       printf(3)).  If the value of errno should be preserved across a
       library call, it must be saved:

           if (somecall() == -1) {
               int errsv = errno;
               printf("somecall() failed\n");
               if (errsv == ...) { ... }
           }

Basically: errno is _extremely_ volatile. One should assume that
_every_ function call is going to change it, even if they return
successfully. The only thing that can't happen is that the functions
defined in the C and POSIX standards set errno to 0, which is why I
withdrew the patch (since it's a wrapper around a function defined in
POSIX). But in general, I don't see any reason for any of the
functions we write to be errno preserving, especially since any call
to malloc, printf, trace functionality, etc. may modify errno.

>
> Patrick