Re: free and errno, was Re: [PATCH] apply: replace mksnpath() with a mkpathdup() call

René Scharfe <l.s.r@xxxxxx> · Sun, 14 Apr 2024 17:17:23 +0200

Am 07.04.24 um 03:18 schrieb Jeff King:
> On Sat, Apr 06, 2024 at 07:45:17PM +0200, René Scharfe wrote:
>
>> I calmed down a bit now.  And ask myself how widespread the issue actually
>> is.  Used the following silly Coccinelle rule to find functions that use
>> errno after a free(3) call:
>>
>> @@
>> @@
>> - free(...);
>>   ...
>>   errno
>>
>> Found only a handful of places, and they all set errno explicitly, so
>> they are fine.
>
> Is that enough, though? Imagine I have:
>
>   int foo(const char *x)
>   {
>      char *y = xstfrmt("something/%s", x);
>      int fd = open(y, ...);
>      free(y);
>      return fd;
>   }
>
> Then errno is important if some caller of foo() checks errno after foo()
> returns an error. And that caller might not even be in the same file.

Yes.

> In other words, it really depends on the contract of foo() with respect
> to errno. And I don't think there is a way in C to specify that
> contract in away that the compiler can understand.

The context attribute of sparse could be used, I guess.  We'd have to
duplicate the declaration of all library functions, though, to tag them
with a may_have_changed_errno attribute.  And we'd need to clear it on
success, so we'd have to modify all callers.  On the flip side it would
allow detecting consecutive errno changes, e.g. by two I/O functions
with no error checking in between.  But the implementation effort seems
way too high.

It would be easier if the error information was in one place instead of
one bit in the return value (NULL or less than 0) and the rest in errno.

>> No idea how to match any use of errno except assignment.  And no idea how
>> to find indirect callers of free(3) that use errno with no potential
>> assignment in between.
>
> Yeah, I guess the indirect callers of free() are really the flip-side.
> My example was indirect users of errno. ;)

Indeed.

>>> The other reason is that macros (especially of system names) can create
>>> their own headaches.  We could require xfree() everywhere as a
>>> complement to xmalloc (or maybe even just places where the errno
>>> preservation seems useful), but that's one more thing to remember.
>>
>> An xfree() to go along with xmalloc()/xrealloc()/xcalloc()/xstrdup() would
>> fit in nicely and might be easier to remember than free() after a while.
>> Having to convert thousands of calls is unappealing, though.
>
> My biggest concern with it is that we'd have to remember to use it, and
> there's not good compiler enforcement. But I guess coccinelle can help
> us there.q

Yes, converting all free(3) calls is easy with Coccinelle, and the same
semantic patch can be used to enforce the use of xfree().  Still the
initial diff would be huge (2000+ changed lines in 300+ files).

> My secondary concern is that it might make people think that xmalloc()
> and xfree() are always paired, and thus you can do clever things in one
> as long as the other matches it. But we sometimes free memory from
> system routines like getline(). Maybe a scary comment would be enough?

Amazing foresight!  Currently we only use getline(3) (which can act like
realloc(3)) in contrib/credential/, though.  Any others?

The "x" prefix doesn't promise exclusivity (hermetic seal?), but it
certainly suggests there are some shared feature between the allocator
functions and xfree(), while they only have in common that the are
wrapped.  We could call the latter errno_preserving_free() instead or so.

I'm more worried about the overhead.  For a 0-1% slowdown (in the case of
git log) git_free() or xfree() give us accurate error numbers on all
platforms.  Non-misleading error codes are worth a lot (seen my share of
cursed messages), but xfree() is only necessary in the error handling
code.  The happy path can call free(3) directly without harm.

But how to classify call sites accordingly?  It's easy for programmers
once they know it's necessary.  Is there a way in C, macro or Coccinelle
to have our cake and only eat it if really needed?  I don't see any. :-|

>> Found four places that did not expect free(3) to mess up their errno by
>> running the test suite with that.  Patch below.
>
> These are perhaps worth fixing (though not if we come up with a
> universal solution). But I'd be surprised if they are the only ones. By
> its nature, this problem only manifests when there are actual errors,
> and our test suite is mostly checking happy paths. So I'd assume there
> are a ton of "if (ret < 0) { free(foo); return -1; }" spots that are
> simply not exercised by the test suite at all.

Makes sense.

René