On Fri, Dec 27, 2024 at 09:16:27PM +0100, René Scharfe wrote: > Am 27.12.24 um 11:33 schrieb Patrick Steinhardt: > > On Wed, Dec 25, 2024 at 07:38:29PM +0100, René Scharfe wrote: > >> When realloc(3) fails, it returns NULL and keeps the original allocation > >> intact. REFTABLE_ALLOC_GROW overwrites both the original pointer and > >> the allocation count variable in that case, simultaneously leaking the > >> original allocation and misrepresenting the number of storable items. > >> > >> parse_names() and reftable_buf_add() avoid leaking by restoring the > >> original pointer value on failure, but all other callers seem to be OK > >> with losing the old allocation. Add a new variant of the macro, > >> REFTABLE_ALLOC_GROW_OR_NULL, which plugs the leak and zeros the > >> allocation counter. Use it for those callers. > > > > Hm, okay. I find it a bit curious to discern those two macros from each > > other as all callers need to handle OOM errors anyway, so doing the safe > > thing should likely be our default here and all callsites that don't > > should be adapted, shouldn't they? > > I agree, and I my first version only had REFTABLE_ALLOC_GROW. Keeping > stuff unchanged if we cannot grow should be safer, right? But it would > introduce a leak if the caller exits without cleaning up, so each of > them needs to be audited. I was too lazy for that. And it's work that > can be parallelized.. Fair enough. > > In the case of `reftable_buf_add()` I kind of doubt the usefulness of > > handling the error just to keep the old pointer intact, as all callsites > > will ultimately error out anyway. > > I can imagine use cases where an object is built piece by piece, one > part is too large and then you still want to keep all the rest and just > replace the huge thing with a placeholder or entirely ignore it. Could > be a case of YAGNI, though. Probably. > > But in the case of `parse_names()` we > > do in fact want to handle the case specially so that we can free any > > names we have already parsed, so that case makes sense indeed. > > Yes. But that leads me on a tangent: Is it really a good idea to load > a file into lots of individual string objects instead of loading into > a single big buffer and pointing directly into it? Do those strings > need to have individual lifetimes? Good question indeed. I don't think we ever need individual lifetimes here. On the other hand it's probably okayish, too, given that the number of table names should be limited due to automatic compaction. > > So there is merit in having two separate wrappers, but it would be nice > > if `REFTABLE_ALLOC_GROW()` would be doing the "right thing" for most > > cases while the above two callsites would be adapted to use a wrapper > > that requires a bit more thought to use correctly. For example something > > like `REFTABLE_TRY_ALLOC_GROW()` or similar. > > So this is about naming? And with "right thing" you mean failing to > grow should lead to destruction? Yup, exactly. I just want to give callers a better indicator which of these functions does what, and having sane defaults helps in my opinion. I guess this also depends on whether or not we want to eventually adapt all callsites to handle allocation errors themselves. Patrick