On Tue, Oct 11, 2016 at 12:07:40PM -0700, Linus Torvalds wrote:

> On Tue, Oct 11, 2016 at 12:01 PM, Jeff King <peff@xxxxxxxx> wrote:
> >
> > My implementation is a little more complicated because it's also setting
> > things up for grouping by trailers (so you can group by "signed-off-by",
> > for example). I don't know if that's useful to you or not.
>
> Hmm. Maybe in theory. But probably not in reality - it's just not
> unique enough (ie there are generally multiple, and if you choose the
> first/last, it should be the same as author/committer, so it doesn't
> actually add anything).

The implementation I did credited each commit multiple times if the
trailer appeared more than once. If you want to play with it, you can
fetch it from:

  git://github.com/peff jk/shortlog-ident

and then something like:

  git shortlog --ident=reviewed-by --format='...reviewed %an'

works. I haven't found it to really be useful for more than toy
statistic gathering, though.

> There are possibly other things that *could* be grouped by and might be useful:
>
>  - main subdirectory it touches (I've often wanted that)
>
>  - rough size of diff or number of files it touches
>
> but realistically both are painful enough that it probably doesn't
> make sense to do in some low-level helper.

Yeah, I think there's a lot of policy there in what counts as "main",
the rough sizes, etc. I've definitely done queries like that before, but
usually by piping "log --numstat" into perl. It's a minor pain to get
the data into perl data structures, but once you have it, you have a lot
more flexibility in what you can compute.

That might be aided by providing more structured machine-readable output
from git, like JSON (which I don't particularly like, but it's kind-of a
standard, and it sure as hell beats XML). But obviously that's another
topic entirely.
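
To make the "log --numstat" post-processing idea concrete, here is a
minimal sketch, written in Python rather than perl. It is only an
illustration, not the script used for the queries mentioned above. The
function name group_by_main_dir is hypothetical, and the policy is an
assumption: a commit's "main" directory is taken to be the top-level
directory with the most added plus deleted lines. It expects input
produced by something like
"git log --no-renames --numstat --format='commit %H'".

```python
from collections import Counter


def group_by_main_dir(log_text):
    """Count commits per 'main' top-level directory.

    Assumed policy: the main directory of a commit is the top-level
    directory with the largest churn (added + deleted lines).
    """
    totals = Counter()
    per_commit = None  # churn per top-level dir for the current commit

    def flush():
        # Credit the finished commit to its highest-churn directory.
        # Commits that list no files (e.g. merges) are skipped.
        if per_commit:
            main_dir, _ = per_commit.most_common(1)[0]
            totals[main_dir] += 1

    for line in log_text.splitlines():
        if line.startswith("commit "):
            flush()
            per_commit = Counter()
        elif "\t" in line and per_commit is not None:
            added, deleted, path = line.split("\t", 2)
            # Binary files report "-" for both counts; treat as one line.
            churn = 1 if added == "-" else int(added) + int(deleted)
            # Files at the repository root are grouped under ".".
            top = path.split("/", 1)[0] if "/" in path else "."
            per_commit[top] += churn
    flush()
    return totals
```

Feeding it the captured output of the git command above gives a Counter
mapping each top-level directory to the number of commits credited to
it. Changing what counts as "main" only means changing how churn is
computed or how the winner is picked in flush().
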

> > I'm fine with this less invasive version, but a few suggestions:
> >
> >  - do you want to call it --group-by=committer (with --group-by=author
> >    as the default), which could later extend naturally to other forms of
> >    grouping?
>
> Honestly, it's probably the more generic one, but especially for
> one-off commands that aren't that common, it's a pain to write. When
> testing it, I literally just used "-c" for that reason.

It's not the end of the world to call it "-c" now, and later define "-c"
as a shorthand for "--group-by=committer", if and when the latter comes
into existence.

Keep in mind that shortlog takes arbitrary revision options, too, and
"-c" is defined there for combined diffs. I can't think of a good reason
to want to pass it to shortlog, though, so I don't think it's a big
loss.

-Peff