On Tue, 5 Jan 2008, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Jakub Narebski <jnareb@xxxxxxxxx> writes: >> On Wed, 4 Jun 2008, Junio C Hamano wrote: >>> >>> Yes, but the current scheme breaks down in another way. When $full_rev >>> added many lines to the file, and you are adding the link to for a line >>> near the end of the file and such a line may not exist. This cannot be >>> cheaply done even inside blame itself. >> >> I think the scheme could be fixed by proposed below git-blame porcelain >> format output extension. > > For the line number information, I do not think so. I think I have not made myself clear enough. In the proposed additional blame format extension "previous" header (or "pre-image" header, or sth like that) would contain pre-image (before change) line number information, or information that line was added in blamed commit. > Luben's "continue from the same line number in the parent commit" is a > cute hack, but that strategy needs a qualifying comment "because hoping > that the same line number in the parent commit might have something > relevant would be better than stopping and giving up sometimes." It > cannot reliably work (and it is not Luben's fault). Therefore my proposal. I'm not sure though if it can be done cheaply. And Luben's idea is good enough in most cases; as a hint is definitely good enough. > But the #l<lno> fragment is just a hint to scroll to that point after > restarting the blame from previous commit and jumping to the result, so it > may not be too big a deal. Such a line may not exist in the resulting > blame page, but that's Ok. Let me give an example on how I visualized proposed "pre-image" header extension would work. Assume that we started from commit 'S' and commit 'B' is to be blamed for the line in question. Let's us assume that commit 'B' has only one parent, and that "context" diff between B^1 and B is available to blame. For an example, let's use [modified] example from GNU diff documentation (info): diff --git-context a/lao-tzu b/lao-tzu index 55fb100..198772c 100644 *** a/lao-tzu --- b/lao-tzu *************** *** 1,7 **** - The Way that can be told of is not the eternal Way; - The name that can be named is not the eternal name. The Nameless is the origin of Heaven and Earth; ! The Named is the mother of all things. Therefore let there always be non-being, so we may see their subtlety, And let there always be being, --- 1,6 ---- The Nameless is the origin of Heaven and Earth; ! The named is the mother of all things. ! Therefore let there always be non-being, so we may see their subtlety, And let there always be being, *************** *** 9,11 **** --- 8,13 ---- The two are the same, But after they are produced, But after they are produced, they have different names. + They both may be called deep and profound. + Deeper and more profound, + The door of all subtleties! First example: lets assume that we want find blame (annotation) for the following line: "The named is the mother of all things." Assuming that block of commonly blamed lines begin with given line, current blame output would look like the following: <sha-1 of commit 'B'> <current-lineno> 2 <block-size> author A U Thor author-mail <author@xxxxxxxxxxx> author-time 1150613103 author-tz -0700 committer C O Mitter committer-mail <committer@xxxxxxxxxxx> committer-time 1150690754 committer-tz -0700 filename lao-tzu summary Be even more cryptic The named is the mother of all things. What I wanted to add was the following header parents <sha-1 of commit 'B^1'> pre-image 2 4 lao-tzu where 2 is the line number in the commit given line is attributed to, where 4 is the line number of _corresponding_ line in pre-image (before change that gave examined line current form), and 'lao-tzu' is the name of file of pre-image for this line. Second example: lets assume that we want find blame (annotation) for the following line: "The door of all subtleties!" This time the line was added in the commit it is attributed to (commit blamed for this line), so there is no corresponding pre-image line. Extra headers would now look like the following: parents <sha-1 of commit 'B^1'> pre-image 13 - lao-tzu Of course 'parents' and 'pre-image' headers can be joined together in the 'previous' header you proposed. >>> Another breakage is even though $full_rev^ _may_ exist (iow, $full_rev >>> might not be the root commit), the file being blamed may not exist there >>> (iow $full_rev might have introduced the file). Instead of running >>> "rev-parse $full_rev^", you would at least need to ask "rev-list -1 >>> $full_rev^ -- $path" or something from the Porcelain layer, but >>> unfortunately this is rather expensive. >> >> Doesn't blame know revision graph for history of a given file already? > > Not in the sense of "rev-list -2 $full_rev -- $path | sed -e 1d". It > builds the graph as it digs deeper, and when it stops, it stopped digging, > so all it knows at that point without further computation is $full_rev^@, > and not "the previous commit that touched the path". > > But as Luben explained (and you drew a simple strand of pearls history to > illustrate), immediate parent is just for the purpose of restarting the > computation. What I worry about is what happens in the (rare I think) case when _merge_ commit is blamed, and firs-parent leg is simplified in the "per-file" history. For example if git-blame output for given line looks like below: 64625efeb1f216c3811230845bb519123ea0ddc5 2 2 1 author Jakub Narebski author-mail <jnareb@xxxxxxxxx> author-time 1212711688 author-tz +0200 committer Jakub Narebski committer-mail <jnareb@xxxxxxxxx> committer-time 1212711688 committer-tz +0200 filename foo summary Merge branch 'b' (Fullstop _and_ capitalization) Second line. and "git show 64625efeb1f216c3811230845bb519123ea0ddc5" is: commit 64625efeb1f216c3811230845bb519123ea0ddc5 Merge: 288f63a... ec70d8b... Author: Jakub Narebski <jnareb@xxxxxxxxx> Date: Fri Jun 6 02:21:28 2008 +0200 Merge branch 'b' (Fullstop _and_ capitalization) diff --cc foo index 11c1e59,2636666..888af51 --- a/foo +++ b/foo @@@ -1,3 -1,3 +1,3 @@@ First line - second line. -Second line ++Second line. Third line. What should be "previous" line then? Or perhaps there should be _two_ "previous" lines. >>> Because blame already almost knows if the commit the final blame lies on >>> has a parent, it would be reasonably cheap to add that "parent or nothing" >>> information to its --porcelain (and its --incremental) format if we wanted >>> to. >> >> It would be easy to add 'parents' header, perhaps empty if we blame >> root commit, or a boundary commit (do we say 'boundary' then?) when >> doing revision limited blaming. > > It shouldn't be too hard to say "parents of the blamed commit that has the > corresponding preimage of the file is this", and the history does not have > to be limited. You need to also handle "the commit that introduced the > path" case just like "root" and "boundary" that we cannot dig further than > that point. Errr... do your code deal with that case (no path before blamed commit)? > I'll follow this message up with two weatherballoon patches. Thanks a lot. It looks like step in good direction (implementing first proposal), with the caveat of attributed commit being evil merge. -- Jakub Narebski Poland -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html