On Fri, Jul 16 2021, Jeff King wrote: [Very late reply, just getting back to this thread] > On Fri, Jul 16, 2021 at 09:46:33AM +0200, Ævar Arnfjörð Bjarmason wrote: > >> > I can't say I _love_ any of that, but I think it would work (and >> > probably we'd adapt our helpers like git_config_pathname() to take a >> > def_string. Or I guess just have a def_string_free() which can be called >> > before writing into them). >> > >> > But maybe there's a better solution I'm missing. >> >> Instead of: "int from_heap" in your "def_string" I think we should just >> use "struct string_list_item". I.e. you want a void* here. Why? > > Yes, an equivalent way to write it is with a separate to_free buffer. > But why would we want it to be void? And why would we want to use a > string_list_item, which is otherwise unrelated? We could factor "string_list_item" out into a "string_pair" or whatever. Sorry, I didn't mean to get into the naming/code location aspect of the discussion, just the sensibility of using a "char */void * pair" for these common memory management cases, v.s. your suggestion of having a "char *, int from_heap" pair. >> Well, tying this back to my clear() improvements for string-list.h I >> thought a really neat solution to this was: >> >> string_list_append(list, ptr)->util = on_heap_now_we_own_it; >> string_list_append(list, mydup)->util = mydup; >> >> I.e. by convention we store the pointer we need to free (if any) in the >> "util" field. > > That works, but now "util" is not available for all the _other_ uses for > which it was intended. And if we're not using it for those other uses, > then why does it need to exist at all? If we are only using it to hold > the allocated string pointer, then shouldn't it be "char *to_free"? Because having it be "char *" doesn't cover the common case of e.g. getting an already allocated "struct something *" which contains your string, setting the "string" in "struct string_list_item" to some string in that struct, and the "util" to the struct itself, as we now own it and want to free() it later in its entirety. That and the even more common case I mentioned upthread of wanting to ferry around the truncated version of some char *, but still wanting to account for the original for an eventual free(). But yes, if you want to account for freeing that data *and* have util set to something else you'll need to have e.g. your own wrapper struct and your own string_list_clear_func() callback. I'm not suggesting that this handles every possible scenario, just that having look at a lot of the code involved recently this seemed like a neat solution for the common cases. >> We're not in the habit of passing loose "string_list_item" around now, >> but I don't see why we wouldn't (possibly with a change to extract that >> bit out, so we could use it in other places). > > It seems unnecessarily confusing to me. It sounds like you have a struct > which just _happens_ to have a "void *" in it you can re-use, so you > start using it in lots of other places that are not in fact string lists > at all. That is confusing to me on the face, but what happens when > string_list needs a feature which requires adding more fields to it? > > If the point is to have a maybe-allocated string, why not make that a > type itself? And then if we want string_list to use it, it can. *nod*, covered above. My examples were unnecessarily confusing... >> The neat thing about doing this is also that you're not left with every >> API boundary needing to deal with your new "def_string", a lot of them >> use string_list already, and hardly need to change anything, to the >> extent that we do need to change anything having a "void *util" is a lot >> more generally usable. You end up getting memory management for free as >> you gain a feature to pass arbitrary data along with your items. > > I don't think most interfaces take a string_list_item now, so wouldn't > they similarly need to be changed? Though the point is that all of these > degrade to a regular C-string, so when you are just passing the value > (and not ownership), you would just dereference at that point. Sure, just like things would need to be changed to handle your proposed "struct def_string". By piggy-backing on an already used struct in our codebase we can get a lot of that memory management pretty much for free without much churn. If you squint and pretend that "struct string_list_item" isn't called something to do with that particular collections API (but it would make use of it) then we've already set up most of the scaffolding and management for this.