Am 17.12.22 um 14:24 schrieb Jeff King: > On Tue, Dec 13, 2022 at 07:31:13PM +0100, René Scharfe wrote: > >>> I think less of a hack is to teach the eventual parse_options() that >>> when it munges it it should free() it. I did that for the revisions API >>> in f92dbdbc6a8 (revisions API: don't leak memory on argv elements that >>> need free()-ing, 2022-08-02). >>> >>> What do you think? >> >> Generating string lists and then parsing them is weird. When calls have >> to cross a process boundary then we have no choice, but in-process we >> shouldn't have to lower our request to an intermediate text format. git >> am does it anyway because it writes its options to a file and reads them >> back when it resumes with --continue, IIUC. > > The argument has been made[1] in the past that the public API for the > revision machinery is not poking bits in the rev_info struct yourself, > but passing the appropriate options to setup_revisions(). My point was that we are stuck with the way it's done in git am specifically because of its on-disk parameter store. It forces us to be able to handle a parameter list that we allocate ourselves instead of being given one by the OS like argc/argv in main(). And this justifies a bespoke solution instead of changing parse_options(). In the meantime Ævar showed enough examples to convince me that a more general solution is indeed needed, but I still think we should keep parse_options() unchanged and add something like strvec_clone_nodup() instead. Regarding using textual interfaces internally in general: It's the easiest method, as we have to provide it for users anyway. It's not simple, though -- generating strings and managing their ownership adds overhead like this patch, which we wouldn't have in an struct interface. > And I can see the point; if option "--foo" requires twiddling both > revs.foo and revs.bar, then we have no way to enforce that callers have > remembered to do both. But if the only code which handles this is the > parser for "--foo", then we're OK. Depends on the specifics. Perhaps .bar is redundant and could be inferred from .foo. Or they could be replaced by an enum. > In a non-C language, we'd probably mark "foo" and "bar" as private and > provide a method to enable the option. We could provide a function, but > we'd have no compiler help to ensure that it was used over fiddling the > bits. Possibly calling them "foo_private" would be sufficient to make > people think twice about tweaking them, though. Or normalize the struct to avoid dependencies between fields. > [1] Apologies for the passive voice; I didn't want to muddle the > paragraph by discussing who and when. But I know this is an argument > Junio has made; I don't know if he still stands by it these days > (there is quite a lot of field-twiddling of rev_info by now). I'm > not sure to what degree I agree with it myself, but I thought it was > worth mentioning here. > >> If we have to change parse_options() at all then I'd prefer it to not >> free() anything (to keep it usable with main()'s parameters), but to >> reorder in a non-destructive way. That would mean keeping the NULL >> sentinel where it is, and making sure all callers use only the returned >> argc to determine which arguments parse_options() didn't recognize. > > My findings with -Wunused-parameter show that there are quite a lot of > places that take argv/argc but assume the NULL-termination of the argv. Right. > If we are just re-ordering argv, though, it feels like this could still > work with a strvec. Right now a strvec with "nr" items will free items 0 > through nr-1, assuming that v[nr] is its NULL invariant. But it could > free v[nr], too, in case the NULL was swapped into an earlier position. > > It's a little weird already, because that swap has violated the > invariant, so trying to strvec_push() onto it would cause confusing > results. But if the general use case is to pass the strvec to > parse_options(), get reordered, and then clear() it, it should work. If > you wanted to get really fancy, push() et al could double-check the > invariant and BUG(). Yes, parse_options() and strvec are not fitting perfectly. Changing the former to reorder the elements and keeping them all would require checking that all callers use the return value. Feels like a lot of work. A variant that takes a strvec and removes and frees parsed items from it would be clean, but would require refactoring indirect callers like apply_parse_options() users. Busywork. Making a shallow copy to give to parse_options() in callers that currently pass a strvec directly or indirectly seems like the simplest solution to me for now. René