On Wed, Oct 05, 2022 at 06:26:57PM -0400, Jeff King wrote: > On Wed, Oct 05, 2022 at 05:43:20PM -0400, Taylor Blau wrote: > > > This caught my attention, so I wanted to see how hard it would be to > > implement. It actually is quite straightforward, and gets us most of the > > way to being able to get the same functionality as in Jacob's patch > > (minus being able to do the for-each-ref-style sub-selectors, like > > `%(authordate:format=%Y-%m)`). Thanks Taylor!! This looks awesome and helped me understand how the pretty context stuff works. I was able to apply your patch locally and test, and plan to continue working off of this :D. Like Peff mentioned seems to be a few usage details to hammer out. > The date thing I think can be done with --date; I just sent a sketch in > another part of the thread. Peff - I applied your --date sketch onto Taylor's patch and it worked first try. > So here you're allowing multiple pretty options. But really, once we > allow the user an arbitrary format, is there any reason for them to do: > > git shortlog --group=%an --group=%ad > > versus just: > > git shortlog --group='%an %ad' > > ? Yes I can't think of an advantage of having multiple custom-formatted group fields. Also see my note on this below related to your comment on specifying multiple groups. > > void shortlog_add_commit(struct shortlog *log, struct commit *commit) > > { > > struct strbuf ident = STRBUF_INIT; > > @@ -243,6 +266,8 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit) > > if (log->groups & SHORTLOG_GROUP_TRAILER) { > > insert_records_from_trailers(log, &dups, commit, &ctx, oneline_str); > > } > > + if (log->groups & SHORTLOG_GROUP_PRETTY) > > + insert_record_from_pretty(log, &dups, commit, &ctx, oneline_str); > > I was puzzled at first that this was a bitwise check. But I forgot that > we added support for --group options already, in 63d24fa0b0 (shortlog: > allow multiple groups to be specified, 2020-09-27). > > So a plan like: > > git shortlog --group=author --group=date > > (as in the original patch in this thread) doesn't quite work, I think. My first patch addressed this by specifically handling cases where the new grouping options were passed in-tandem with existing options, and making sure only a single shortlog group was generated. But if we're generalizing the custom group format then it might be unecessary to even allow the custom group in tandem with most other options (like 'author' and 'committer'), since those options can be included in the custom group format. The trailer option might be an exception, but that could possibly just be handled as a special case. > We probably want to insist that the format contains a "%" sign, and/or > git it a keyword like "format:". Otherwise a typo like: > > git shortlog --format=autor > > stops being an error we detect, and just returns nonsense results > (every commit has the same ident). Small aside: I like how Taylor re-used the --group option for the custom format. IMO it hammers home that this is a grouping option and not just formatting or filtering which can be confusing to users sometimes when doing data analytics. But your points here all still apply. Maybe detecting a "%" sign can be the way that git identifies the custom format being passed in. In conjuction with the % identifiers, this would still enable users to add in some arbitrary constant label like --group='Year: %cd' --date='%Y', without affecting the grouped/sorted results since all entries would then include the "Years: " prefix (or postfix or however they decide to write it). The one case that should probably be handled is when no % sign is used and no other matching flags like --author or --committer are either, because currently that will just group all commits under 1 nonsensical group name like "autor" as you mentioned. > I think you'd want to detect SHORTLOG_GROUP_PRETTY in the > read_from_stdin() path, too. And probably just die() with "not > supported", like we do for trailers. Glad you said this because I applied this in my original patch with the explicit --year and --month groups. Didn't seem to be an obvious use-case with the stdin even though my original year and month values could possibly be read from the git log output supplied as stdin. But for a generalized group format seems even more far-fetched to try and make it jive with the stdin version of the command. -Jack