Re: [RFC PATCH v2] builtin/shortlog: explicitly set hash algo when there is no repo

On Wed, Oct 16, 2024 at 10:47:03AM +0200, Wolfgang Müller wrote:
> On 2024-10-16 07:32, Patrick Steinhardt wrote:
> > On Tue, Oct 15, 2024 at 01:48:26PM +0200, Wolfgang Müller wrote:
> > > diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
> > > index c20c885724..ed39c67ba1 100755
> > > --- a/t/t4201-shortlog.sh
> > > +++ b/t/t4201-shortlog.sh
> > > @@ -143,6 +143,11 @@ fuzz()
> > >  	test_grep "too many arguments" out
> > >  '
> > >  
> > > +test_expect_success 'shortlog --author from non-git directory does not segfault' '
> > > +	git log --no-expand-tabs HEAD >log &&
> > > +	env GIT_DIR=non-existing git shortlog --author=author <log 2>out
> > > +'
> > > +
> > 
> > I'd like to see another testcase added that exercises behaviour when
> > git-shortlog(1) is passed SHA256 output outside of a repo, as I'm
> > curious how it'll behave in that scenario.
> 
> I had a look at this in builtin/shortlog.c's read_from_stdin() and am
> pretty sure that git-shortlog(1), when reading from stdin, simply
> ignores anything but either the "Author:" or "Commit:" lines, depending
> on the value given by --group. The --group=format: option is not
> supported when reading from stdin. Neither is --format handled at all.
> 
> So I don't think there is actually a way to make git-shortlog(1)
> encounter and handle a commit hash when reading from stdin; the hash
> algorithm seems completely meaningless for its user-facing behaviour. As
> far as I have seen the closest it comes to getting into contact with a
> hash (or more specifically, hexsz) is when cmd_shortlog() sets:
> 
> 	log.abbrev = rev.abbrev;
> 
> This relies on the parsing machinery in parse_revision_opt() - the one
> this patch is for. Technically --abbrev is honored by git-shortlog(1)
> when reading from stdin, but its value goes unused because of the
> difference in code paths when reading from stdin. Do take this with a
> grain of salt, however, this is my first look at the inner workings of
> git-shortlog(1).

Okay, thanks for the explanation.

Given that we do set `log.abbrev`, I think we should be hitting code
paths in git-shortlog(1) that use it. `git shortlog --format=%h`, for
example, would use `log.abbrev`, wouldn't it? It would be nice to figure
out whether this can be made to misbehave based on which object hash we
have in the input.

> As for the test, I'd be happy to provide one if this is still deemed
> necessary after considering the above. There's two questions I have:
> 
> 1) Is this already covered by GIT_TEST_DEFAULT_HASH=sha256? Running the
> t4201-shortlog testsuite with that passes.

I think it doesn't hurt to have an explicit test for this scenario, even
if it just demonstrates that things don't crash or behave in weird ways.

> 2) I've already experimented with setting up a test for this and am
> unsure how to cleanly set up a sha256 repository. Ordinarily it should
> be a simple init/add (perhaps with test_commit), but t4201-shortlog is
> already running within a git repository if I understand the setup step
> correctly. Is there a clean way to escape from there, or would it simply
> be fine creating another repository within that one?

You can take e.g. b2dbf97f47 (builtin/index-pack: fix segfaults when
running outside of a repo, 2024-09-04) as inspiration for how to achieve
this.
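
As a rough illustration of the scenario, here is a standalone sketch, not
the test from that commit and not using the t/ test library. It creates a
throwaway SHA-256 repository, captures its log, and feeds that log to
git-shortlog(1) from a directory outside of any repository. The temporary
paths, the identity "A U Thor" and the commit message are assumptions made
up for the example; it also assumes a Git new enough to support
`--object-format=sha256`.

```shell
#!/bin/sh
# Hypothetical standalone sketch; a real test would live in t/t4201-shortlog.sh.
set -e

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Create a throwaway SHA-256 repository with a single empty commit.
git init --quiet --object-format=sha256 "$tmp/repo"
git -C "$tmp/repo" -c user.name="A U Thor" -c user.email=author@example.com \
	commit --quiet --allow-empty -m "initial commit"

# Capture log output whose "commit" lines carry 64-hex-digit object IDs.
git -C "$tmp/repo" log >"$tmp/log"

# Run shortlog from a directory that is not inside any repository.
# GIT_CEILING_DIRECTORIES stops repository discovery from walking above $tmp.
mkdir "$tmp/nongit"
(
	cd "$tmp/nongit" &&
	GIT_CEILING_DIRECTORIES="$tmp" git shortlog <"$tmp/log" >"$tmp/out"
)

cat "$tmp/out"
```

The sketch runs plain `git shortlog`; adding `--author=author` as in the
test quoted above would also go through the option parsing that this patch
touches, and may crash on a Git without the fix.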

Patrick



