Re: [RFC PATCH 0/5] strvec: add a "nodup" mode, fix memory leaks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 19, 2022 at 10:20:00AM +0100, Ævar Arnfjörð Bjarmason wrote:

> I hear you, but I also think you're implicitly conflating two things
> here.
> 
> There's the question of whether we should in general optimize for safety
> over more optimila memory use. I.e. if we simply have every strvec,
> string_list etc. own its memory fully we don't need to think as much
> about allocation or ownership.
> 
> I think we should do that in general, but we also have cases where we'd
> like to not do that, e.g. where we're adding thousands of strings to a
> string_list, which are all borrewed from elsewhere, except for a few
> we'd like to xstrdup().
> 
> Such API use *is* tricky, but I think formalizing it as the
> "string_list" does is making it better, not worse. In particular...:

I think the two things are related, though, because "safety" is "people
not mis-using the API". Once you give the API the option to do the
trickier thing, people will do it, and they will do it badly. That has
been my experience with the string-list API (especially that people try
to do clever things with transferring ownership by using a
non-duplicating list and then setting strdup_strings midway through the
function).

> > I would disagree that this hasn't been an issue in practice. A few
> > recent examples:
> >
> >   - 5eeb9aa208 (refs: fix memory leak when parsing hideRefs config,
> >     2022-11-17)
> >   - 7e2619d8ff (list_objects_filter_options: plug leak of filter_spec
> >     strings, 2022-09-08)
> >   - 4c81ee9669 (submodule--helper: fix "reference" leak, 2022-09-01)
> 
> ...it's funny that those are the examples I would have dug up to argue
> that this is a good idea, and to add some:
> 
> 	- 4a0479086a9 (commit-graph: fix memory leak in misused
>           string_list API, 2022-03-04)
> 	- b202e51b154 (grep: fix a "path_list" memory leak, 2021-10-22)
> 
> I.e. above you note "in both directions [...] leaks [...] and double
> frees", but these (and the ones I added) are all in the second category.

I almost didn't give a list, because it's hard to dig into all of the
details. For example the second one in my list, which I worked on, did
fix things by consistently setting strdup_strings, and it was "just" a
leak fix. But it took me a full day to untangle the mess of that code,
and there were intermediate states were we did access dangling pointers
before I got to a working order of the series.  And all of it was
complicated by the fact that string_list is configurable, so different
bits of the code expected different things. Even after that commit,
there's a horrible hack to set the strdup field and hope it's enough.

If that were the only time I'd wrestled with string_list, I'd be less
concerned. But (subjectively) this feels like the latest in a long line
of bugs and irritations.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux