Re: [REVISED PATCH 2/6] Introduce commit notes

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Thu, 19 Jul 2007 10:20:34 -0700 (PDT)

On Wed, 18 Jul 2007, Junio C Hamano wrote:
> 
> Another anchoring clue you seem not to be exploiting fully is
> that the ASCII part must match "^[1-7][0-7]{4,5} " (mode bytes).

I did that on purpose.

The SHA1 *can* contain those characters too, so that's not really useful
to us, and the only special character really is the NUL character (which
is the only one that cannot exist in the ASCII part - old-style trees can
contain '/' too, although that's going away).

Also, the mode bytes may not be visible: if we start in a long filename, 
we'll never have looked at the mode bytes, but if we see a NUL character 
after having seen 20 non-NUL characters (long filename), we already know 
we got it. So I don't think we can even usefully use the other knowledge 
of the format of the ASCII part (other than to know it doesn't contain 
NUL's).
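
To make the argument concrete, here is a minimal C sketch of that
resynchronization step, assuming the usual tree entry layout of
"<mode> <name>", a NUL, and then a 20-byte binary SHA1. The names
find_entry_boundary() and SHA1_RAWSZ are made up for illustration; this
is not code from git. A NUL that follows at least 20 non-NUL bytes cannot
be inside a SHA1 (the name's terminating NUL would have been among those
20 bytes), so it has to terminate a name.

```c
#include <stddef.h>

#define SHA1_RAWSZ 20

/*
 * Hypothetical helper: scan forward from an arbitrary offset 'start' in a
 * raw tree buffer and return the offset of the first entry start that we
 * can be certain about, or -1 if no such place was found.
 */
static long find_entry_boundary(const unsigned char *buf, size_t len,
				size_t start)
{
	size_t run = 0;	/* consecutive non-NUL bytes seen so far */
	size_t i;

	for (i = start; i < len; i++) {
		if (buf[i]) {
			run++;
			continue;
		}
		if (run >= SHA1_RAWSZ) {
			/* This NUL ends a name; skip it and the SHA1. */
			size_t next = i + 1 + SHA1_RAWSZ;
			return next <= len ? (long)next : -1;
		}
		/* Ambiguous NUL (could be inside a SHA1): start over. */
		run = 0;
	}
	return -1;
}
```

A return of -1 only means the scan never saw a long enough run, in which
case a caller would presumably fall back to walking the tree linearly.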

Of course, we can (and should) verify that the tree entry we find is 
valid, and *then* it makes sense to check the rules for the ASCII part. 
But that's only after we have already found the place.
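
Once a candidate entry start is known, that check could look roughly like
the following C sketch. It applies the mode pattern quoted above
("^[1-7][0-7]{4,5} "), requires a NUL-terminated name, and requires room
for the 20-byte SHA1. entry_looks_valid() is a hypothetical helper, not an
existing git function, and rejecting empty names is an extra assumption.

```c
#include <stddef.h>

#define SHA1_RAWSZ 20

/*
 * Hypothetical check: does a plausible tree entry start at offset 'pos'?
 * Returns 1 if the bytes look like "<mode> <name>\0<20-byte SHA1>".
 */
static int entry_looks_valid(const unsigned char *buf, size_t len,
			     size_t pos)
{
	size_t i = pos, digits = 0, name_start;

	/* First mode digit must be 1-7 (no leading zero). */
	if (i >= len || buf[i] < '1' || buf[i] > '7')
		return 0;
	i++;

	/* Then four or five more octal digits. */
	while (i < len && buf[i] >= '0' && buf[i] <= '7') {
		digits++;
		i++;
	}
	if (digits < 4 || digits > 5)
		return 0;

	/* A single space separates the mode from the name. */
	if (i >= len || buf[i] != ' ')
		return 0;
	i++;

	/* The name runs up to the first NUL (assumed non-empty here). */
	name_start = i;
	while (i < len && buf[i])
		i++;
	if (i == len || i == name_start)
		return 0;

	/* The binary SHA1 must fit after the NUL. */
	return len - (i + 1) >= SHA1_RAWSZ;
}
```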

> I was suggesting to have a specialized parser only to read such
> tree objects that are "abused" to represent notes.  You can
> cheaply validate that these trees are of expected shape.

Sure. That said, I'm less interested in the notes than I am in the cost of
"git blame", and that could be optimized by having some special code in 
"tree_entry_interesting()" to find the tree entries using binary search.

The special code would trigger only for:
 - large trees
 - "opt->nr_paths == 1"

but the latter case is the one that matters for blame in the first place, 
so..

		Linus