On Wed, May 29, 2024 at 05:16:33AM -0400, Jeff King wrote: > On Mon, May 27, 2024 at 08:44:46AM +0200, Patrick Steinhardt wrote: > > > diff --git a/strbuf.c b/strbuf.c > > > diff --git a/strbuf.c b/strbuf.c > > > index e1076c9891..aed699c6bf 100644 > > > --- a/strbuf.c > > > +++ b/strbuf.c > > > @@ -656,10 +656,8 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) > > > * we can just re-init, but otherwise we should make sure that our > > > * length is empty, and that the result is NUL-terminated. > > > */ > > > - if (!sb->buf) > > > - strbuf_init(sb, 0); > > > - else > > > - strbuf_reset(sb); > > > + FREE_AND_NULL(sb->buf); > > > + strbuf_init(sb, 0); > > > return EOF; > > > } > > > #else > > > > > > But I think either of those would solve your leak, _and_ would help with > > > similar leaks of strbuf_getwholeline() and friends. > > > > I'm not quite convinced that `strbuf_getwholeline()` should deallocate > > the buffer for the caller, I think that makes for quite a confusing > > calling convention. The caller may want to reuse the buffer for other > > operations, and it feels hostile to release the buffer under their feet. > > > > The only edge case where I think it would make sense to free allocated > > data is when being passed a not-yet-allocated strbuf. But I wonder > > whether the added complexity would be worth it. > > I'm not sure what they'd reuse it for. We necessarily have to reset it > before reading, so the contents are now garbage. The allocated buffer > could be reused, but since everybody has to call strbuf_grow() before > assuming they can write, it's not a correctness issue, but only an > optimization. But that optimization is pretty unlikely to matter. Since > we hit this code only on EOF or error, it's generally going to happen > once in a program, and not in a tight loop. > > If we really cared, though, I think you could check sb->alloc before the > call to getdelim(), and then we'd know whether the original held an > allocation or not (and we could restore its state). That's what other > syscall-ish strbuf functions like strbuf_readlink() and strbuf_getcwd() > do. Ah, I didn't know that we did similar things in other strbuf functions. With that precedence I think it's less ugly to do this dance. > That said, I agree that leaks here are not going to be common. Most > callers are going to call it in a loop and unconditionally release at > the end, whether they get multiple lines or not. The "append" function > is the odd man out by reading a single line into a new buffer[1]. > > Looking through the results of: > > git grep -P '(?<!while) \(!?strbuf_get(whole)?line' > > I saw only one questionable case. builtin/difftool.c does: > > if (strbuf_getline_nul(&lpath, fp)) > break; > > without freeing lpath. But then...it does not free it in the case that > we got a value, either! So I think it is leaking either way, and the > solution, to strbuf_release(&lpath) outside of the loop, would fix both > cases. Indeed. We also didn't free `rpath` and `info`. I do have a follow up to this series already, so let me add those leak fixes to it. > > I've been going through all callsites and couldn't spot any that doesn't > > free the buffer on EOF. So I'd propose to leave this as-is and revisit > > if we eventually see that this is causing more memory leaks. > > OK. I don't feel too strongly about it, but mostly thought it seemed > inconsistent with the philosophy of those other strbuf functions. I get where you're coming from now with the additional info that other syscall-ish functions do a similar dance. I'll refrain from rerolling this series just to fix this in a different way, also because neither of us did spot any additional leaks caused by this. Patrick
Attachment:
signature.asc
Description: PGP signature