On Wed, 2011-03-02 at 11:40 -0500, Jeff King wrote:
> [I know, I know, another RFC. I'll get to actually cleaning up and
> submitting some of these patches soon.]
>
> It's sometimes useful to get a list of files in a tree along with the
> last commit that touched them. This is the default tree view shown on
> github.com, but it can also be handy from the command line (there has
> been talk lately of having a "git ls"), or as plumbing for a local
> fancier tree view. E.g., something like:
>
>   add.c    6e7293e git-add: make -A description clearer vs. -u
>   apply.c  fd03881 add description parameter to OPT__VERBOSE
>   blame.c  9ca1169 parse-options: Don't call parse_options_check() so much
>   branch.c 62270f6 branch_merged: fix grammar in warning
>   bundle.c 62b4698 Use angles for placeholders consistently
>
> The obvious naive way to do this is something like:
>
>   for i in `git ls-tree --name-only HEAD`; do
>     echo "`git rev-list -1 --no-merges HEAD -- $i` $i";
>   done
>
> which is really slow, because we end up traversing the same commits many
> times (plus the startup overhead for each rev-list). It takes about 35
> seconds to run on git.git.
>
> So the next obvious thing is to do one traversal, output the changed
> files for each commit, and then mark each file as you see it. The perl
> script below does this (though the careful reader will note it is
> actually buggy with sub-trees; I didn't bother fixing it since it was
> just a stage in the evolution):
>
[code snipped]
>
> This runs in about 3 seconds. And besides the above-mentioned bug,
> also doesn't properly handle things like filenames that need quoting.
>
> So I wrote it in C, which drops the time down to about 1.5 seconds, and
> of course doesn't have any parsing issues. The patch is below.
>
> I wasn't sure at first what to call it or what the calling conventions
> should be. The initial thought was to make it part of "ls-tree". But
> that feels wrong, as ls-tree otherwise never cares about traversal. The
> combination of traversal and diff made me think of blame, and indeed, I
> think this is really just about blaming a whole tree at the file-level,
> rather than at the content-level. Thus I called it blame-tree, and I
> used the same calling conventions as blame: "git blame-tree <path>
> <rev opts>". See the test script for examples.
>
> I have many thoughts on the patch already, but rather than put them
> here, I'll include the patch without further ado, and put them inline in
> a reply.
>
[patch snipped]

Coincidentally, I'm doing a similar thing in a shell script at the
moment. Unfortunately, no tree-object is involved: I'm instead using the
output from "git diff" on two different branches to generate a list of
files I care about.

How hard would it be to accept a nul-delimited list of filenames via
stdin, rather than from a tree? If I'm reading this right, it looks like
a pretty trivial change.

(I couldn't get the existing patch to apply, myself.. I assume I'm just
doing something wrong as I don't need to use "git am" very often.)
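
For illustration, here is a rough shell sketch of the single-traversal
idea ("mark each file as you see it") applied to a NUL-delimited list of
paths on stdin instead of a tree. It is not the blame-tree patch and makes
no claim about how the C code works. The function names mark_first_seen
and blame_paths are made up for this example, and it assumes filenames
contain no newlines and are plain enough that git does not quote them.

```shell
#!/bin/sh
# Sketch only, under the assumptions stated above.

# mark_first_seen WANTED_FILE
# Reads "git log --format='commit %h' --name-only" style text on stdin.
# Prints "<hash> <path>" the first time each wanted path appears, i.e.
# for the newest commit that touched it, and stops once all are found.
mark_first_seen() {
    awk -v wanted="$1" '
        BEGIN {
            # Load the paths of interest, one per line.
            while ((getline p < wanted) > 0)
                if (!(p in want)) { want[p] = 1; total++ }
        }
        /^commit / { hash = $2; next }   # header line of the next commit
        /^$/       { next }              # blank separator lines
        ($0 in want) && !($0 in seen) {
            seen[$0] = 1
            print hash, $0
            if (++found == total) exit   # nothing left to look for
        }
    '
}

# blame_paths [REV]
# Hypothetical wrapper: expects NUL-delimited paths on stdin.
blame_paths() {
    rev=${1:-HEAD}
    list=$(mktemp) || return 1
    tr '\0' '\n' > "$list"               # assumes no newlines in filenames
    git log --no-merges --format='commit %h' --name-only "$rev" |
        mark_first_seen "$list"
    rm -f "$list"
}
```

Something like "git diff -z --name-only branchA branchB | blame_paths HEAD"
would be the intended use. Converting NUL to newline keeps the awk part
simple but gives up the robustness that NUL delimiting is meant to buy, so
a real implementation would want to stay NUL-clean end to end.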