Re: [RFC] git blame-tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff King <peff@xxxxxxxx> writes:

> On Wed, Mar 02, 2011 at 11:40:32AM -0500, Jeff King wrote:

> > It's sometimes useful to get a list of files in a tree along with the
> > last commit that touched them. This is the default tree view shown on
> > github.com, but it can also be handy from the command line (there has
> > been talk lately of having a "git ls"), or as plumbing for a local
> > fancier tree view. E.g., something like:
> > 
> >      add.c 6e7293e git-add: make -A description clearer vs. -u
> >    apply.c fd03881 add description parameter to OPT__VERBOSE
> >    blame.c 9ca1169 parse-options: Don't call parse_options_check() so much
> >   branch.c 62270f6 branch_merged: fix grammar in warning
> >   bundle.c 62b4698 Use angles for placeholders consistently
[...] 

> >  blame-tree.c          |  156 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  blame-tree.h          |   25 ++++++++
> >  builtin.h             |    1 +
> >  builtin/blame-tree.c  |   34 +++++++++++
> 
> I tried to lib-ify the implementation as much as possible. It increases
> the lines of code, of course, but I figured there was a reasonable
> chance that there might be a user-friendly "git ls" command eventually,
> and this would probably make a good "-v" option to it.

I think it _might_ be a good idea to add `--blame' option to
"git ls-tree", as a one of ways of presenting tree-blame output.
Or perhaps as part of "git ls".

In "[RFC] Tree blame (git blame <directory>)"[1] I proposed for

  $ git blame --abbrev v1.6.3.3 -- .

to generate

  100644 blob e57630e ba19a80 Junio C Hamano      Feb 10 17:42   walker.c
  100644 blob 8a149e1 c13b263 Daniel Barkalow     Apr 26  2008   walker.h
  100644 blob 7eb3218 fc71db3 Alex Riesen         Apr 29 23:21   wrapper.c
  100644 blob 4c29255 559e840 Junio C Hamano      Jul 20  2008   write_or_die.c
  100644 blob 819c797 a437900 Junio C Hamano      Jun 21 02:35   ws.c
  100644 blob 1b6df45 2af202b Linus Torvalds      Jun 18 10:28   wt-status.c
  100644 blob 78add09 6c2ce04 Marius Storm-Olsen  Jun  5  2008   wt-status.h
  100644 blob b9b0db8 eb3a9dd Benjamin Kramer     Mar  7 21:02   xdiff-interface.c
  100644 blob 7352b9a 86295bb Rene Scharfe        Oct 25  2008   xdiff-interface.h
  040000 tree ef5d413 5719db9 Charles Bailey      May 25 01:21   xdiff/

or something like that.  Date doesn't have to be in this strange format
used by 'ls'.  Also instead of name we can use username part of email,
or just email; OTOH git-blame uses above format for author.

This could be result of "git ls-tree --abbrev --blame v1.6.3.3"...
and it could be combined with `-l' option of git-ls-tree.

[1]: http://article.gmane.org/gmane.comp.version-control.git/122830

> 
> I considered making it a special mode of "git blame" when blame is fed a
> directory instead of a file. But the implementations aren't shared at
> all (nor do I think they need to be; blame-tree is _way_ simpler). And I
> didn't want to steal that concept in case somebody can think of a more
> content-level way of blaming a whole tree that makes sense (obviously
> just showing the concatenation of the blames of each file is one way,
> but I don't know how useful that would be). If we want to go that way,
> we can always catch the special case in blame and just exec blame-tree.

Well, having "git blame [<rev>] <directory>" to output tre-blame
would allow to reuse some of already existing options to ordinary
git-blame; well those that makes sense, like `-b', `-S <revs-file>',
`--reverse', perhaps (depending on available output) also `-l', `-t',
`-s', `--date <format>'.

<rev> is here starting revision or revision range; if it is revision
range then negative specifiers function as boundary.

We could use `-M' to turn on rename detection, and `-C' to turn on
copy detection; I think that in tree-blame we need to consider only
_exact_ renames (pure renames, i.e. the same SHA-1, different name).

Also for GitHub (and perhaps also in the future for gitweb too) would
I think use `--porcelain' or even `--incremental' version of tree-blame;
in [1] I have proposed the following output (following existing "for
porcelain" format):

  $ git blame --porcelain v1.6.3.3 -- .

  86295bb6bac1482d29650d1f77f19d8e7a7cc2fe 7352b9a9c204c2b1d4ca9df5ce040fe22d6f521c
  author Rene Scharfe
  author-mail <rene.scharfe@xxxxxxxxxxxxxx>
  author-time 1224941475
  author-tz +0200
  committer Junio C Hamano
  committer-mail <gitster@xxxxxxxxx>
  committer-time 1224961771
  committer-tz -0700
  summary add xdi_diff_hunks() for callers that only need hunk lengths
  filename xdiff-interface.h
  100644 blob 7352b9a9c204c2b1d4ca9df5ce040fe22d6f521c    xdiff-interface.h
  5719db91ce5915ee07c50f1afdc94fe34e91529f ef5d413237b3a390007fba56671b00d7c371ae1e
  author Charles Bailey
  author-mail <charles@xxxxxxxxxxxxx>
  author-time 1243210874
  author-tz +0100
  committer Junio C Hamano
  committer-mail <gitster@xxxxxxxxx>
  committer-time 1243234594
  committer-tz -0700
  summary add xdi_diff_hunks() for callers that only need hunk lengths
  filename xdiff
  040000 tree ef5d413237b3a390007fba56671b00d7c371ae1e    xdiff

> > +void blame_tree_init(struct blame_tree *bt)
> > +{
> > +	memset(bt, 0, sizeof(*bt));
> > +	bt->paths.strdup_strings = 1;
> > +	init_revisions(&bt->rev, NULL);
> > +	bt->rev.no_merges = 1;
> > +	bt->rev.def = "HEAD";
> > +}
> 
> I turn off merges by default, since they are unlikely to be interesting
> matches (you will see the merge of a side-branch that touched a file
> instead of the actual commit on the side-branch). You could of course do
> "git blame-tree . --no-merges" to get the same effect. I think no-merges
> makes a saner default, but sadly it doesn't seem like there is a way to
> turn no-merges back off ("--merges" means something else, and there is
> no --no-no-merges").

IMHO merges are interecting; moreover if I remember correctly my proof
of concept of tree-blame which I tried to implement in Perl using
Git.pm (git cat-file --batch + git diff-tree --stdin), I have problems
with merges in tree-blame of subdirectory ("--relative" option doesn't
work as I thought it did).
 
> Right now the code just handles trees. But in the long run, it would
> probably make sense to get the list of files from the index, and mark
> files modified in the working tree or index, too. So something like:
> 
>   foo.c  1234abcd this is a commit subject
>   bar.c  modified in working tree
>   baz.c  modified in index
> 
> Sort of like how gitk shows "pseudo-commits" on top of history to
> indicate changes.

Or how "git blame" handles "--contents <file>" option... though what
you mentioned is more than that.

[...] 
> > +int cmd_blame_tree(int argc, const char **argv, const char *prefix)
> > +{
> > +	struct blame_tree bt;
> > +	const char *path = NULL;
> > +
> > +	git_config(git_default_config, NULL);
> > +
> > +	if (argv[1]) {
> > +		path = argv[1];
> > +		argc--;
> > +		argv++;
> > +	}
> 
> Obviously no options. Probably there should at least be "--porcelain" to
> output the current form, and the default output should be more
> user-friendly. And probably "-z" to avoid quoting issues.

Thank you for working on this.
-- 
Jakub Narebski
Poland
ShadeHawk on #git
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]