Re: [PATCH 1/2] shortlog: introduce `--group-filter` to restrict output

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 08, 2023 at 10:34:02AM -0400, Derrick Stolee wrote:
> On 6/7/2023 7:02 PM, Taylor Blau wrote:
>
> > This means that you could easily view the hashes of all commits you
> > either wrote or co-authored with something like:
> >
> >     $ git shortlog -n --group=author --group=trailer:Co-authored-by \
> >         --group-filter="$(git config user.name)"
> >
> > When filtering just by trailers, it is tempting to want to introduce a
> > new grep mode for matching a given trailer, like `--author=<pattern>`
> > for matching the author header. But this would not be suitable for the
> > above, since we want commits which match either the author or the
> > Co-authored-by trailer, not ones which match both.
>
> One thing that is not immediately obvious from reading the patch, but
> becomes clearer in patch 2, is that your --group-filter is an exact
> string match. This differs from the --author filter in 'git log' and
> similar, which is actually a case-insensitive substring match.

Yeah, funnily enough I originally thought about adding a `--trailer`
flag to the revision machinery, but realized that it was a non-starter
since `--author` is documented as only showing commits matching that
author.

So that doesn't solve for the case of finding a commit which has the
identity you want buried in some trailer, but is written by a different
author. shortlog needs to see all commits along the traversal, so I went
with the filtering approach instead.

Thanks for the pointer on beefing up the documentation to indicate that
this is an exact search.

> This is the critical piece of code for this issue. Replacing it with
>
> static int want_shortlog_group(struct shortlog *log, const char *group)
> {
> 	struct string_list_item *item;
> 	if (!log->group_filter.nr)
> 		return 1;
>
> 	for_each_string_list_item(item, &log->group_filter) {
> 		if (strcasestr(group, item->string))
> 			return 1;
> 	}
>
> 	return 0;
> }
>
> Results in the case-insensitive substring search that I would expect
> from this parameter.
>
> This would also solve the problem from Patch 2 where we want to search
> by email address. Using '-e --group-filter="my@xxxxxxxxx"' works, though
> it will catch users with 'tammy@xxxxxxxxx' emails, as well.

Yeah, thanks for raising it. I wonder if there are other semantics that
don't incorrectly return substring matches. You could conceivably extend
the --group-filter argument to take an extended regular expression, and
then just match on whatever's inside the last "<>".

That approach was one that I considered, but feels a little heavyweight
for what we're trying to do here. One potential approach is to keep
things as-is, and then consider extending `--group-filter` to mean
"extended regular expression" in the future. Although I'm not sure that
would work, either, since fixed-string matches would suddenly match
anywhere in the string, breaking backwards compatibility.

You could add another option that specifies the behavior of
`--group-filter`, or add some extra bit of information there like
`--group-filter=egrep:<pattern>` or something.

At least there's no shortage of options ;-).

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux