On Mon, Dec 19, 2022 at 10:20:00AM +0100, Ævar Arnfjörð Bjarmason wrote: > I hear you, but I also think you're implicitly conflating two things > here. > > There's the question of whether we should in general optimize for safety > over more optimila memory use. I.e. if we simply have every strvec, > string_list etc. own its memory fully we don't need to think as much > about allocation or ownership. > > I think we should do that in general, but we also have cases where we'd > like to not do that, e.g. where we're adding thousands of strings to a > string_list, which are all borrewed from elsewhere, except for a few > we'd like to xstrdup(). > > Such API use *is* tricky, but I think formalizing it as the > "string_list" does is making it better, not worse. In particular...: I think the two things are related, though, because "safety" is "people not mis-using the API". Once you give the API the option to do the trickier thing, people will do it, and they will do it badly. That has been my experience with the string-list API (especially that people try to do clever things with transferring ownership by using a non-duplicating list and then setting strdup_strings midway through the function). > > I would disagree that this hasn't been an issue in practice. A few > > recent examples: > > > > - 5eeb9aa208 (refs: fix memory leak when parsing hideRefs config, > > 2022-11-17) > > - 7e2619d8ff (list_objects_filter_options: plug leak of filter_spec > > strings, 2022-09-08) > > - 4c81ee9669 (submodule--helper: fix "reference" leak, 2022-09-01) > > ...it's funny that those are the examples I would have dug up to argue > that this is a good idea, and to add some: > > - 4a0479086a9 (commit-graph: fix memory leak in misused > string_list API, 2022-03-04) > - b202e51b154 (grep: fix a "path_list" memory leak, 2021-10-22) > > I.e. above you note "in both directions [...] leaks [...] and double > frees", but these (and the ones I added) are all in the second category. I almost didn't give a list, because it's hard to dig into all of the details. For example the second one in my list, which I worked on, did fix things by consistently setting strdup_strings, and it was "just" a leak fix. But it took me a full day to untangle the mess of that code, and there were intermediate states were we did access dangling pointers before I got to a working order of the series. And all of it was complicated by the fact that string_list is configurable, so different bits of the code expected different things. Even after that commit, there's a horrible hack to set the strdup field and hope it's enough. If that were the only time I'd wrestled with string_list, I'd be less concerned. But (subjectively) this feels like the latest in a long line of bugs and irritations. -Peff