Re: [RFC PATCH v2] shortlog: add group-by options for year and month

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 05, 2022 at 06:26:57PM -0400, Jeff King wrote:
> On Wed, Oct 05, 2022 at 05:43:20PM -0400, Taylor Blau wrote:
> 
> > This caught my attention, so I wanted to see how hard it would be to
> > implement. It actually is quite straightforward, and gets us most of the
> > way to being able to get the same functionality as in Jacob's patch
> > (minus being able to do the for-each-ref-style sub-selectors, like
> > `%(authordate:format=%Y-%m)`).

Thanks Taylor!! This looks awesome and helped me understand how the pretty
context stuff works. I was able to apply your patch locally and test,
and plan to continue working off of this :D. Like Peff mentioned seems to
be a few usage details to hammer out.

> The date thing I think can be done with --date; I just sent a sketch in
> another part of the thread.

Peff - I applied your --date sketch onto Taylor's patch and it worked first try.

> So here you're allowing multiple pretty options. But really, once we
> allow the user an arbitrary format, is there any reason for them to do:
> 
>   git shortlog --group=%an --group=%ad
> 
> versus just:
> 
>   git shortlog --group='%an %ad'
> 
> ?

Yes I can't think of an advantage of having multiple custom-formatted group
fields. Also see my note on this below related to your comment on specifying
multiple groups.

> >  void shortlog_add_commit(struct shortlog *log, struct commit *commit)
> >  {
> >  	struct strbuf ident = STRBUF_INIT;
> > @@ -243,6 +266,8 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
> >  	if (log->groups & SHORTLOG_GROUP_TRAILER) {
> >  		insert_records_from_trailers(log, &dups, commit, &ctx, oneline_str);
> >  	}
> > +	if (log->groups & SHORTLOG_GROUP_PRETTY)
> > +		insert_record_from_pretty(log, &dups, commit, &ctx, oneline_str);
> 
> I was puzzled at first that this was a bitwise check. But I forgot that
> we added support for --group options already, in 63d24fa0b0 (shortlog:
> allow multiple groups to be specified, 2020-09-27).
> 
> So a plan like:
> 
>   git shortlog --group=author --group=date
> 
> (as in the original patch in this thread) doesn't quite work, I think.

My first patch addressed this by specifically handling cases where the new
grouping options were passed in-tandem with existing options, and making sure
only a single shortlog group was generated. But if we're generalizing the custom
group format then it might be unecessary to even allow the custom group in tandem
with most other options (like 'author' and 'committer'), since those options can be
included in the custom group format. The trailer option might be an exception, but
that could possibly just be handled as a special case.

> We probably want to insist that the format contains a "%" sign, and/or
> git it a keyword like "format:". Otherwise a typo like:
> 
>   git shortlog --format=autor
> 
> stops being an error we detect, and just returns nonsense results
> (every commit has the same ident).

Small aside: I like how Taylor re-used the --group option for the custom format.
IMO it hammers home that this is a grouping option and not just formatting or
filtering which can be confusing to users sometimes when doing data analytics.

But your points here all still apply. Maybe detecting a "%" sign can be the way
that git identifies the custom format being passed in. In conjuction with the %
identifiers, this would still enable users to add in some arbitrary constant label
like --group='Year: %cd' --date='%Y', without affecting the grouped/sorted results
since all entries would then include the "Years: " prefix (or postfix or however
they decide to write it).

The one case that should probably be handled is when no % sign is used and no other
matching flags like --author or --committer are either, because currently that will
just group all commits under 1 nonsensical group name like "autor" as you mentioned.

> I think you'd want to detect SHORTLOG_GROUP_PRETTY in the
> read_from_stdin() path, too. And probably just die() with "not
> supported", like we do for trailers.

Glad you said this because I applied this in my original patch with the explicit
--year and --month groups. Didn't seem to be an obvious use-case with the stdin 
even though my original year and month values could possibly be read from the git
log output supplied as stdin. But for a generalized group format seems even more
far-fetched to try and make it jive with the stdin version of the command.

-Jack



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux