On Mon, Oct 10, 2022 at 09:48:47PM -0400, Taylor Blau wrote: > I think that shortlog_finish_setup() is probably the most reasonable > choice (and I changed it to that locally). I spent a day or so thinking > on and off about this while writing the series, and didn't come up with > any satisfying names. > > I think it points to something deeper about the API that doesn't quite > sit right with me. But shortlog_finish_setup() is the least-unsatisfying > name so far, so let's go with that ;-). Yeah, touching that block in make_cover_letter() definitely tickled my memory that there was some awkwardness there. That's when I added shortlog_init() at all (which is good, because otherwise you'd have had an uninitialized log.format string-list). I think the general pattern of: foo_init(); foo.option = whatever; foo_finish_setup(); foo_do_the_thing(); is OK. It's a little complicated, but it gives callers the freedom to tweak options with a lot of flexibility (e.g., based on command line config, command line options, etc). We have a similar pattern with diff_setup_done(). The other option is for any option-tweaking to go through methods which maintain invariants (like if GROUP_AUTHOR is set, then the "%an" format has been added). That's usually the "right" object-oriented way to do it. But I think even in this simple case it gets awkward, because the correct format can't be known until you see whether log->email is set. So it really has to be a "finalize" step after both log->email and log->group are set. > > This loses the HAS_MULTI_BITS() optimization. The idea there is that if > > you have a single group-by that can't produce multiple outputs, then > > there's no need to do duplicate detection. > > > > The equivalent in an all-formats world is something like: > > > > log.format.nr > 1 && !log.trailers.nr > > > > (because trailers are special in that one trailer key can produce > > multiple idents for a single commit). > > Hmm. I suspect that this is going to become more complicated by the time > that you read the final patch ;-). Let's wait until then and see what > you think. Yes, it's pretty gross with the trailer util thing. You'd probably want to do something like: static int needs_dedup(const struct string_list *formats) { struct string_list_item *item; if (formats->nr > 1) return 1; for_each_string_list_item(item, formats) { if (item->util) return 1; } return 0; } and perhaps call it only once and cache the result, rather than iterating for each commit/format. But if we leave trailers as is, then the logic is much easier. And implementing the optimization for --group=format should fall out pretty naturally (and that both preserves it for --group=author, and extends it to --group=format, which I should have noticed when reviewing patch 4). > > > @@ -439,8 +440,8 @@ int cmd_shortlog(int argc, const char **argv, const char *prefix) > > > log.file = rev.diffopt.file; > > > log.date_mode = rev.date_mode; > > > > > > - if (!log.groups) > > > - log.groups = SHORTLOG_GROUP_AUTHOR; > > > + shortlog_init_group(&log); > > > + > > > string_list_sort(&log.trailers); > > > > Now that we have a "finish the shortlog init" function, probably this > > trailer sort should go into it, too. The current caller doesn't need it, > > but it removes on more gotcha from using the shortlog API. > > We'll drop this list by the end of the series, too, so it probably isn't > worth moving it into shortlog_finish_setup() in the interim. Ah, right. Well, see my other comments. :) -Peff