Re: [RFC PATCH v2] shortlog: add group-by options for year and month

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 23, 2022 at 05:58:56PM -0400, Jeff King wrote:
> On Fri, Sep 23, 2022 at 09:17:10AM -0700, Junio C Hamano wrote:
>
> > Not a suggestion to use a different implementation or add a new
> > feature on top of this --group-by-time-range idea, but I wonder if
> > it is a more flexible and generalizeable approach to say "formulate
> > this value given by the --format=<format> string, apply this regular
> > expression match, and group by the subexpression value".  E.g.
> >
> >     git shortlog \
> > 	--group-by-value="%cI" \
> > 	--group-by-regexp="^(\d{4}-\d{2})"
>
> Heh, I was about to make the exact same suggestion. The existing
> "--group=author" could really just be "--group='%an <%ae>'" (or variants
> depending on the "-e" flag).

This caught my attention, so I wanted to see how hard it would be to
implement. It actually is quite straightforward, and gets us most of the
way to being able to get the same functionality as in Jacob's patch
(minus being able to do the for-each-ref-style sub-selectors, like
`%(authordate:format=%Y-%m)`).

Here's the patch:

--- >8 ---

diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 7a1e1fe7c0..68880e8867 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -200,6 +200,29 @@ static void insert_records_from_trailers(struct shortlog *log,
 	unuse_commit_buffer(commit, commit_buffer);
 }

+static void insert_record_from_pretty(struct shortlog *log,
+				      struct strset *dups,
+				      struct commit *commit,
+				      struct pretty_print_context *ctx,
+				      const char *oneline)
+{
+	struct strbuf ident = STRBUF_INIT;
+	size_t i;
+
+	for (i = 0; i < log->pretty.nr; i++) {
+		if (i)
+			strbuf_addch(&ident, ' ');
+
+		format_commit_message(commit, log->pretty.items[i].string,
+				      &ident, ctx);
+	}
+
+	if (strset_add(dups, ident.buf))
+		insert_one_record(log, ident.buf, oneline);
+
+	strbuf_release(&ident);
+}
+
 void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 {
 	struct strbuf ident = STRBUF_INIT;
@@ -243,6 +266,8 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 	if (log->groups & SHORTLOG_GROUP_TRAILER) {
 		insert_records_from_trailers(log, &dups, commit, &ctx, oneline_str);
 	}
+	if (log->groups & SHORTLOG_GROUP_PRETTY)
+		insert_record_from_pretty(log, &dups, commit, &ctx, oneline_str);

 	strset_clear(&dups);
 	strbuf_release(&ident);
@@ -321,8 +346,10 @@ static int parse_group_option(const struct option *opt, const char *arg, int uns
 	else if (skip_prefix(arg, "trailer:", &field)) {
 		log->groups |= SHORTLOG_GROUP_TRAILER;
 		string_list_append(&log->trailers, field);
-	} else
-		return error(_("unknown group type: %s"), arg);
+	} else {
+		log->groups |= SHORTLOG_GROUP_PRETTY;
+		string_list_append(&log->pretty, arg);
+	}

 	return 0;
 }
@@ -340,6 +367,7 @@ void shortlog_init(struct shortlog *log)
 	log->in2 = DEFAULT_INDENT2;
 	log->trailers.strdup_strings = 1;
 	log->trailers.cmp = strcasecmp;
+	log->pretty.strdup_strings = 1;
 }

 int cmd_shortlog(int argc, const char **argv, const char *prefix)
diff --git a/shortlog.h b/shortlog.h
index 3f7e9aabca..d7caecb76f 100644
--- a/shortlog.h
+++ b/shortlog.h
@@ -20,8 +20,10 @@ struct shortlog {
 		SHORTLOG_GROUP_AUTHOR = (1 << 0),
 		SHORTLOG_GROUP_COMMITTER = (1 << 1),
 		SHORTLOG_GROUP_TRAILER = (1 << 2),
+		SHORTLOG_GROUP_PRETTY = (1 << 3),
 	} groups;
 	struct string_list trailers;
+	struct string_list pretty;

 	int email;
 	struct string_list mailmap;

--- 8< ---

> I don't think you even really need the regexp. If we respect --date,
> then you should be able to ask for --date=format:%Y-%m. Unfortunately
> there's no way to specify the format as part of the placeholder. The
> for-each-ref formatter understands this, like:
>
>   %(authordate:format:%Y-%m)
>
> I wouldn't be opposed to teaching the git-log formatter something
> similar.

Yeah, I think having a similar mechanism there would be useful, and
certainly a prerequisite to being able to achieve what Jacob has done
here with the more general approach.

I think you could also do some cleanup on top, like replacing the
SHORTLOG_GROUP_AUTHOR mode with adding either "%aN <%aE>" (or "%aN",
without --email) as an entry in the `pretty` string_list.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux