On Wed, Oct 05, 2022 at 05:43:20PM -0400, Taylor Blau wrote:

> > Heh, I was about to make the exact same suggestion. The existing
> > "--group=author" could really just be "--group='%an <%ae>'" (or variants
> > depending on the "-e" flag).
>
> This caught my attention, so I wanted to see how hard it would be to
> implement. It actually is quite straightforward, and gets us most of the
> way to being able to get the same functionality as in Jacob's patch
> (minus being able to do the for-each-ref-style sub-selectors, like
> `%(authordate:format=%Y-%m)`).

Yeah, your patch is about what I'd expect. The date thing I think can be
done with --date; I just sent a sketch in another part of the thread.

> +static void insert_record_from_pretty(struct shortlog *log,
> +                                      struct strset *dups,
> +                                      struct commit *commit,
> +                                      struct pretty_print_context *ctx,
> +                                      const char *oneline)
> +{
> +        struct strbuf ident = STRBUF_INIT;
> +        size_t i;
> +
> +        for (i = 0; i < log->pretty.nr; i++) {
> +                if (i)
> +                        strbuf_addch(&ident, ' ');
> +
> +                format_commit_message(commit, log->pretty.items[i].string,
> +                                      &ident, ctx);
> +        }

So here you're allowing multiple pretty options. But really, once we
allow the user an arbitrary format, is there any reason for them to do:

  git shortlog --group=%an --group=%ad

versus just:

  git shortlog --group='%an %ad'

?

>  void shortlog_add_commit(struct shortlog *log, struct commit *commit)
>  {
>          struct strbuf ident = STRBUF_INIT;
> @@ -243,6 +266,8 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
>          if (log->groups & SHORTLOG_GROUP_TRAILER) {
>                  insert_records_from_trailers(log, &dups, commit, &ctx, oneline_str);
>          }
> +        if (log->groups & SHORTLOG_GROUP_PRETTY)
> +                insert_record_from_pretty(log, &dups, commit, &ctx, oneline_str);

I was puzzled at first that this was a bitwise check. But I forgot that
we added support for multiple --group options already, in 63d24fa0b0
(shortlog: allow multiple groups to be specified, 2020-09-27).

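To make the bitwise check concrete, here is a minimal standalone sketch
(not the real shortlog code). The flag names and values are made up for
illustration, and count_group_passes() is a hypothetical helper. It only
shows that each --group option ORs one bit into the mask, and that every
bit that is set gets its own pass over the commit:

```c
/* Hypothetical flag values for illustration; not copied from shortlog.h. */
enum {
	GROUP_AUTHOR  = (1 << 0),
	GROUP_TRAILER = (1 << 1),
	GROUP_PRETTY  = (1 << 2)
};

/*
 * Count how many grouping passes a single commit goes through for a
 * given mask. Each set bit is checked independently, so one commit
 * can be handled once per group type that was requested.
 */
int count_group_passes(unsigned int groups)
{
	int passes = 0;

	if (groups & GROUP_AUTHOR)
		passes++;
	if (groups & GROUP_TRAILER)
		passes++;
	if (groups & GROUP_PRETTY)
		passes++;
	return passes;
}
```

With "groups = GROUP_AUTHOR | GROUP_PRETTY" (roughly what
"--group=author --group=<format>" would set up in this sketch), the
commit goes through two separate passes rather than one combined one.
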
So a plan like:

  git shortlog --group=author --group=date

(as in the original patch in this thread) doesn't quite work, I think,
because the semantics for multiple --group lines are that the commit is
credited individually to each ident. That's what lets you do:

  git shortlog -ns --group=author --group=trailer:co-authored-by

and credit authors and co-authors equally.

So likewise, I think multiple group-format options don't really make
sense (or at least, do not make sense to concatenate; you'd put each key
in its own single format).

> @@ -321,8 +346,10 @@ static int parse_group_option(const struct option *opt, const char *arg, int uns
>          else if (skip_prefix(arg, "trailer:", &field)) {
>                  log->groups |= SHORTLOG_GROUP_TRAILER;
>                  string_list_append(&log->trailers, field);
> -        } else
> -                return error(_("unknown group type: %s"), arg);
> +        } else {
> +                log->groups |= SHORTLOG_GROUP_PRETTY;
> +                string_list_append(&log->pretty, arg);
> +        }

We probably want to insist that the format contains a "%" sign, and/or
give it a keyword like "format:". Otherwise a typo like:

  git shortlog --group=autor

stops being an error we detect, and just returns nonsense results (every
commit has the same ident).

I think you'd want to detect SHORTLOG_GROUP_PRETTY in the
read_from_stdin() path, too. And probably just die() with "not
supported", like we do for trailers.

> I think you could also do some cleanup on top, like replacing the
> SHORTLOG_GROUP_AUTHOR mode with adding either "%aN <%aE>" (or "%aN",
> without --email) as an entry in the `pretty` string_list.

Yeah, that would be a nice cleanup. I think it might even be a good idea
to explain the various options to the users in terms of "--group=author
is equivalent to %aN <%aE>". It may help them understand how the tool
works.

-Peff
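
P.S. For the "%" / "format:" check, something along these lines is what
I had in mind. This is only a standalone sketch using plain libc, not
code from the patch: group_format_from_arg() is a hypothetical helper,
and the "format:" keyword is just the suggestion above, not an existing
option.

```c
#include <string.h>

/*
 * Decide whether a --group argument should be treated as a format.
 *
 * Returns a pointer to the format string when `arg` either starts with
 * the (hypothetical) "format:" keyword or contains a '%' placeholder.
 * Returns NULL otherwise, so the caller can keep reporting an
 * "unknown group type" error for typos such as "autor".
 */
const char *group_format_from_arg(const char *arg)
{
	static const char prefix[] = "format:";
	size_t len = sizeof(prefix) - 1;

	if (!strncmp(arg, prefix, len))
		return arg + len;	/* explicit keyword: skip past it */
	if (strchr(arg, '%'))
		return arg;		/* looks like a pretty format */
	return NULL;			/* neither: treat as an error */
}
```

In the real parser the NULL case would fall through to the existing
error() call, so "--group=autor" keeps failing loudly instead of
grouping every commit under the same ident.
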