Re: [PATCH] reftable: use xmalloc() and xrealloc()

Patrick Steinhardt <ps@xxxxxx> · Mon, 8 Apr 2024 18:33:37 +0200

On Mon, Apr 08, 2024 at 08:42:19AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@xxxxxx> writes:
> 
> > These are part of the library interfaces that should ideally not be tied
> > to the Git code base at all so that they can theoretically be reused by
> > another project like libgit2. So I think that instead of rewriting the
> > generic fallbacks we should call `reftable_set_alloc()` somewhen early
> > in Git's startup code.
> 
> It sounds like a sensible approach to me on the surface.
> 
> The reftable_subsystem_init() function, which would be the interface
> into "reftable library" from Git side, can call such customization
> functions supplied by the library.
> 
> > It does raise the question what to do about the generic fallbacks.
> 
> Generic fallbacks would be a plain vanilla malloc(), realloc(), and
> friends, right?

Yeah.

> > We could start calling `abort()` when we observe allocation
> > failures. It's not exactly nice behaviour in a library though,
> > where the caller may in fact want to handle this case.
> 
> But they would not be able to "handle" it, unless their substitute
> alloc() function magically finds more memory and never runs out.  If
> you want to allow them to "handle" the situation, the library itself
> needs be prepared to see NULL returned from the allocator, and fail
> the operation it was doing, and return an error.  If the caller asks
> reftable_write_foo(), which may need to allocate some memory to
> finish its work, it would see NULL and say "sorry, I cannot
> continue", and return an error to its caller, I would imagine.
> 
> I think there are two levels of "handling" allocation and its
> failure.  Substituting allocation functions would be a way to solve
> only one of them (i.e. somehow allow the library client to specify a
> way to supply you an unbounded amount of memory).  As long as the
> library is not willing to check allocation failures and propagate
> the error to the caller, you would have to "abort" the operation no
> matter what before returning the control back to your client, and at
> that point you would start wanting to make it customizable how to
> "abort".

I actually think that the reftable library _should_ be willing to check
for allocation failures and return proper error codes to the caller.
That would be quite an undertaking, but there is no need to do it all in
a single go. We can refactor the code over time to start handling such
failures.
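
To illustrate what "checking and propagating" could look like, here is a
minimal sketch. It is not taken from the reftable code: `struct
example_buf`, `example_buf_add()` and `EXAMPLE_OUT_OF_MEMORY_ERROR` are
hypothetical names made up for this example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error code; not taken from the reftable headers. */
#define EXAMPLE_OUT_OF_MEMORY_ERROR (-1)

/* Hypothetical growable buffer, for illustration only. */
struct example_buf {
	char *data;
	size_t len;
	size_t cap;
};

/*
 * Append `n` bytes to the buffer. Returns 0 on success or a negative
 * error code when the allocator fails. On failure the buffer is left
 * untouched so that the caller can still use or release it.
 */
static int example_buf_add(struct example_buf *buf, const void *src, size_t n)
{
	if (!n)
		return 0;

	if (n > buf->cap - buf->len) {
		size_t new_cap = buf->cap ? buf->cap : 16;
		char *new_data;

		while (new_cap - buf->len < n) {
			if (new_cap > SIZE_MAX / 2)
				return EXAMPLE_OUT_OF_MEMORY_ERROR;
			new_cap *= 2;
		}

		new_data = realloc(buf->data, new_cap);
		if (!new_data)
			return EXAMPLE_OUT_OF_MEMORY_ERROR;
		buf->data = new_data;
		buf->cap = new_cap;
	}

	memcpy(buf->data + buf->len, src, n);
	buf->len += n;
	return 0;
}
```

Every caller of such a function would then have to check the return
value and pass it further up, which is why the conversion would have to
happen incrementally.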

> Their custom "abort" function might do longjmp() to try to "recover",
> or simply call die() in our case where Git is the library client, I
> guess.  So reftable_set_alloc() and reftable_set_abort() may need to
> be there.  If you make it mandatory to call them, you can punt and
> make it the responsibility of the library clients to worry about error
> handling, I guess?

That would be a possibility indeed. A custom "failure" function may try
to e.g. release caches such that the allocation can be retried. And if
everything fails, then in theory the caller could do a longjmp(3P).
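
A rough sketch of such a retrying "failure" hook could look as follows.
All names here (`example_oom_handler`, `example_alloc_with_retry()`,
`example_drop_cache()`, `example_always_fail()`) are hypothetical and
only serve to illustrate the idea; none of this exists in the reftable
library.

```c
#include <stdlib.h>

/* Hypothetical hook: returns non-zero if it managed to free memory. */
typedef int (*example_oom_handler)(void *payload);

/*
 * Try to allocate `size` bytes with `alloc`. Whenever that fails, ask
 * the handler to release memory and retry for as long as the handler
 * reports progress. Returns NULL once the handler gives up.
 */
static void *example_alloc_with_retry(size_t size, void *(*alloc)(size_t),
				      example_oom_handler handler,
				      void *payload)
{
	void *p = alloc(size);
	while (!p && handler && handler(payload))
		p = alloc(size);
	return p;
}

/* Hypothetical cache that can be dropped under memory pressure. */
struct example_cache {
	void *block;
};

static int example_drop_cache(void *payload)
{
	struct example_cache *cache = payload;
	if (!cache->block)
		return 0; /* Nothing left to release. */
	free(cache->block);
	cache->block = NULL;
	return 1;
}

/* Demo-only allocator that simulates an exhausted heap. */
static void *example_always_fail(size_t size)
{
	(void)size;
	return NULL;
}
```

When the handler gives up, the NULL still reaches the caller, who then
has to either propagate an error or escape via longjmp(3P).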

In practice this could cause all kinds of problems though. Imagine for
example that we have acquired a lockfile and then subsequently an
allocation fails. If the application were to longjmp(3P) then all the
cleanup code would not be invoked at all, thus leaving behind a stale
lockfile.
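
For contrast, this is roughly what the cleanup path looks like when the
failure is propagated as an error code instead. It is only a sketch
using plain POSIX calls; `example_update_locked()`,
`example_failing_alloc()` and the error codes are hypothetical, and the
allocator is passed in merely so that a failure can be simulated.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical error codes, for illustration only. */
#define EXAMPLE_LOCK_IO_ERROR (-1)
#define EXAMPLE_LOCK_OOM_ERROR (-2)

/* Demo-only allocator that simulates an allocation failure. */
static void *example_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * Take a lock by exclusively creating `lock_path`, allocate scratch
 * memory and release the lock again. A real implementation would
 * write the new contents and rename the lock into place on success;
 * this sketch always removes the lock so that it stays short.
 */
static int example_update_locked(const char *lock_path,
				 void *(*alloc)(size_t),
				 void (*release)(void *),
				 size_t scratch_size)
{
	void *scratch = NULL;
	int err = 0;
	int fd;

	fd = open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0666);
	if (fd < 0)
		return EXAMPLE_LOCK_IO_ERROR;

	scratch = alloc(scratch_size);
	if (!scratch) {
		err = EXAMPLE_LOCK_OOM_ERROR;
		goto out;
	}

	/* ... the actual update would be written via `fd` here ... */

out:
	/*
	 * Reached on both success and failure. A longjmp() out of the
	 * allocator would skip this block and leave the lock behind.
	 */
	if (scratch)
		release(scratch);
	close(fd);
	unlink(lock_path);
	return err;
}
```

With this shape, a failing allocation makes the function return the
error code after the lock has been removed, so a later attempt can take
the lock again.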

Overall I think that handling allocation failures is the more flexible
approach in the long run, even though it requires more work.

Patrick