Re: More precise tag following

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Sun, 28 Jan 2007 11:57:33 -0800 (PST)

On Sun, 28 Jan 2007, Junio C Hamano wrote:
> 
> Do you mean the perl-Gtk one by Jeff King?

Sorry, yeah, I'm just confused.

Where are my meds again?

> I was hoping to take a look at Shawn's git-gui and also perhaps
> looking into adding blame --incremental support to gitk myself
> when I have time, but unfortunately my day-job deadline is
> spilling into this weekend.

I think the nice thing about the new "git-blame --incremental" is that it 
allows people who really don't know (or care) anything at all about git 
internals to do the viewer. So you shouldn't need to care.

So I don't think you should do it; we should encourage others (who may not 
be comfy with writing hard-core C that touches subtle internal git issues) 
to just do it.

One thing I looked at, which *should* be easy to do inside "git-blame", is 
to make the case where you do *not* give a head to start with, default to 
"current working tree" instead of HEAD.

For example, say that I have changes in my working tree, and I do

	git blame-viewer <filename-that-is-dirty>

I think it would be nice if the *dirty* lines would actually get blamed to 
a fake commit (SHA-1 "00000000..") that is the "current working tree". 
Right now, it always starts from HEAD:filename, which may be how CVS/SVN 
annotate and friends work, but I actually think we could do better.

If you really want the annotation for the _committed_ state, you can 
always just say so explicitly:

	git blame-viewer HEAD <filename-that-may-be-dirty-but-who-cares>

No?

But for the actual viewer parts, which don't need internal git knowledge, 
let's just document the blame format, so that others can do it:

The new format is fairly easy to parse. Each blame entry:

 - always starts with a line of

	<40-byte hex sha1> <sourceline> <resultline> <num_lines>

 - the first time that commit shows up in the stream, it has various
   other information about it printed out with a one-word tag at the 
   beginning of each line about that "extended commit info" (author, 
   email, committer, dates, summary etc)

 - each entry is _always_ finished by a

	"filename" <whitespace-quoted-filename-goes-here>

and thus it's really quite easy to parse for some line- and word-oriented 
parser (which should be quite natural for most scripting languages).
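
To give a feel for how little a viewer needs, here is a minimal Python 
sketch of such a line- and word-oriented parser. The names "BlameEntry" 
and "parse_blame_stream" are made up for this example, and it assumes the 
stream is already available as an iterable of text lines. It also leaves 
the filename exactly as it appears in the stream, without undoing any 
quoting.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, Optional


@dataclass
class BlameEntry:
    """One blame entry; the class and field names are illustrative only."""
    sha1: str
    source_line: int
    result_line: int
    num_lines: int
    filename: str = ""
    # Extended commit info lines, keyed by their one-word tag.
    info: Dict[str, str] = field(default_factory=dict)


def parse_blame_stream(lines: Iterable[str]) -> Iterator[BlameEntry]:
    """Yield one BlameEntry per '<sha1> ...' to 'filename ...' group."""
    entry: Optional[BlameEntry] = None
    for raw in lines:
        line = raw.rstrip("\n")
        if entry is None:
            if not line:
                continue  # tolerate stray blank lines between entries
            # Header: <sha1> <sourceline> <resultline> <num_lines>
            sha1, source_line, result_line, num_lines = line.split(" ")
            entry = BlameEntry(sha1, int(source_line),
                               int(result_line), int(num_lines))
            continue
        # Every other line is "<tag> <rest of line>".
        tag, _, value = line.partition(" ")
        if tag == "filename":
            # The "filename" line always closes the entry.
            entry.filename = value
            yield entry
            entry = None
        else:
            entry.info[tag] = value
```

Because entries are yielded as soon as their "filename" line arrives, a 
viewer built on something like this can update its display while the 
blame is still running.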

NOTE! For people who do parsing: to make it more robust, just ignore any 
lines in between the first and last one ("<sha1>" and "filename" lines) 
where you don't recognize the tag-words (or care about that particular 
one) at the beginning of the "extended information" lines. That way, if 
more information is ever added (like the commit encoding or extended 
commit commentary), a blame viewer won't ever care.
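
As a rough sketch of that rule in Python: keep a set of the tag-words the 
viewer cares about and silently skip everything else. The function name 
"extract_known_info" and the particular set of tags are assumptions made 
for this example, not a list of everything git-blame prints.

```python
from typing import Dict, Iterable

# Tags this hypothetical viewer cares about; anything else is skipped.
KNOWN_TAGS = {"author", "committer", "summary"}


def extract_known_info(extended_lines: Iterable[str]) -> Dict[str, str]:
    """Collect recognized "<tag> <value>" lines and ignore the rest."""
    info: Dict[str, str] = {}
    for raw in extended_lines:
        tag, _, value = raw.rstrip("\n").partition(" ")
        if tag in KNOWN_TAGS:
            info[tag] = value
        # Unknown tags are dropped on purpose, so lines added to the
        # format later cannot break the viewer.
    return info
```

Feeding it the lines between the "<sha1>" line and the "filename" line 
gives back only the fields the viewer asked for, whatever else is in 
there.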

		Linus