On Wed, Oct 16, 2024 at 10:47:03AM +0200, Wolfgang Müller wrote: > On 2024-10-16 07:32, Patrick Steinhardt wrote: > > On Tue, Oct 15, 2024 at 01:48:26PM +0200, Wolfgang Müller wrote: > > > diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh > > > index c20c885724..ed39c67ba1 100755 > > > --- a/t/t4201-shortlog.sh > > > +++ b/t/t4201-shortlog.sh > > > @@ -143,6 +143,11 @@ fuzz() > > > test_grep "too many arguments" out > > > ' > > > > > > +test_expect_success 'shortlog --author from non-git directory does not segfault' ' > > > + git log --no-expand-tabs HEAD >log && > > > + env GIT_DIR=non-existing git shortlog --author=author <log 2>out > > > +' > > > + > > > > I'd like to see another testcase added that exercises behaviour when > > git-shortlog(1) is passed SHA256 output outside of a repo, as I'm > > curious how it'll behave in that scenario. > > I had a look at this in builtin/shortlog.c's read_from_stdin() and am > pretty sure that git-shortlog(1), when reading from stdin, simply > ignores anything but either the "Author:" or "Commit:" lines, depending > on the value given by --group. The --group=format: option is not > supported when reading from stdin. Neither is --format handled at all. > > So I don't think there is actually a way to make git-shortlog(1) > encounter and handle a commit hash when reading from stdin; the hash > algorithm seems completely meaningless for its user-facing behaviour. As > far as I have seen the closest it comes to getting into contact with a > hash (or more specifically, hexsz) is when cmd_shortlog() sets: > > log.abbrev = rev.abbrev; > > This relies on the parsing machinery in parse_revision_opt() - the one > this patch is for. Technically --abbrev is honored by git-shortlog(1) > when reading from stdin, but its value goes unused because of the > difference in code paths when reading from stdin. Do take this with a > grain of salt, however, this is my first look at the inner workings of > git-shortlog(1). Okay, thanks for the explanation. Given that we do set `log.abbrev` I think we should be hitting code paths in git-shortlog(1) that use it. `git shortlog --format=%h` for example would use `log.abbrev`, wouldn't it? It would be nice to figure out whether this can be made to misbehave based on which object hash we have in the input. > As for the test, I'd be happy to provide one if this is still deemed > necessary after considering the above. There's two questions I have: > > 1) Is this already covered by GIT_TEST_DEFAULT_HASH=sha256? Running the > t4201-shortlog testsuite with that passes. I think it doesn't hurt to have an explicit test for this scenario, even if it just demonstrates that things don't crash or behave in weird ways. > 2) I've already experimented with setting up a test for this and am > unsure how to cleanly set up a sha256 repository. Ordinarily it should > be a simple init/add (perhaps with test_commit), but t4201-shortlog is > already running within a git repository if I understand the setup step > correctly. Is there a clean way to escape from there, or would it simply > be fine creating another repository within that one? You can take e.g. b2dbf97f47 (builtin/index-pack: fix segfaults when running outside of a repo, 2024-09-04) as inspiration for how to achieve this. Patrick