Re: [PATCH] Avoid errors from git-rev-parse in gitweb blame

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 5 Jan 2008, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Jakub Narebski <jnareb@xxxxxxxxx> writes:
>> On Wed, 4 Jun 2008, Junio C Hamano wrote:
>>>
>>> Yes, but the current scheme breaks down in another way.  When $full_rev
>>> added many lines to the file, and you are adding the link to for a line
>>> near the end of the file and such a line may not exist.  This cannot be
>>> cheaply done even inside blame itself.
>>
>> I think the scheme could be fixed by proposed below git-blame porcelain
>> format output extension.
> 
> For the line number information, I do not think so.

I think I have not made myself clear enough.  In the proposed additional
blame format extension "previous" header (or "pre-image" header, or sth
like that) would contain pre-image (before change) line number
information, or information that line was added in blamed commit.

> Luben's "continue from the same line number in the parent commit" is a
> cute hack, but that strategy needs a qualifying comment "because hoping
> that the same line number in the parent commit might have something
> relevant would be better than stopping and giving up sometimes."  It
> cannot reliably work (and it is not Luben's fault).

Therefore my proposal.  I'm not sure though if it can be done cheaply.
And Luben's idea is good enough in most cases; as a hint is definitely
good enough.

> But the #l<lno> fragment is just a hint to scroll to that point after
> restarting the blame from previous commit and jumping to the result, so it
> may not be too big a deal.  Such a line may not exist in the resulting
> blame page, but that's Ok.

Let me give an example on how I visualized proposed "pre-image" header
extension would work.

Assume that we started from commit 'S' and commit 'B' is to be blamed
for the line in question.  Let's us assume that commit 'B' has only
one parent, and that "context" diff between B^1 and B is available to
blame.  For an example, let's use [modified] example from GNU diff
documentation (info):

     diff --git-context a/lao-tzu b/lao-tzu
     index 55fb100..198772c 100644
     *** a/lao-tzu
     --- b/lao-tzu
     ***************
     *** 1,7 ****
     - The Way that can be told of is not the eternal Way;
     - The name that can be named is not the eternal name.
       The Nameless is the origin of Heaven and Earth;
     ! The Named is the mother of all things.
       Therefore let there always be non-being,
         so we may see their subtlety,
       And let there always be being,
     --- 1,6 ----
       The Nameless is the origin of Heaven and Earth;
     ! The named is the mother of all things.
     !
       Therefore let there always be non-being,
         so we may see their subtlety,
       And let there always be being,
     ***************
     *** 9,11 ****
     --- 8,13 ----
       The two are the same,
       But after they are produced,
       But after they are produced,
         they have different names.
     + They both may be called deep and profound.
     + Deeper and more profound,
     + The door of all subtleties!

First example: lets assume that we want find blame (annotation) for
the following line:
      "The named is the mother of all things."
Assuming that block of commonly blamed lines begin with given line,
current blame output would look like the following:

  <sha-1 of commit 'B'> <current-lineno> 2 <block-size>
  author A U Thor
  author-mail <author@xxxxxxxxxxx>
  author-time 1150613103
  author-tz -0700
  committer C O Mitter
  committer-mail <committer@xxxxxxxxxxx>
  committer-time 1150690754
  committer-tz -0700
  filename lao-tzu
  summary Be even more cryptic
  	The named is the mother of all things.

What I wanted to add was the following header

  parents <sha-1 of commit 'B^1'>
  pre-image 2 4 lao-tzu

where 2 is the line number in the commit given line is attributed to,
where 4 is the line number of _corresponding_ line in pre-image (before
change that gave examined line current form), and 'lao-tzu' is the name
of file of pre-image for this line.


Second example: lets assume that we want find blame (annotation) for
the following line:
      "The door of all subtleties!"

This time the line was added in the commit it is attributed to (commit
blamed for this line), so there is no corresponding pre-image line.
Extra headers would now look like the following:

  parents <sha-1 of commit 'B^1'>
  pre-image 13 - lao-tzu


Of course 'parents' and 'pre-image' headers can be joined together in
the 'previous' header you proposed.

>>> Another breakage is even though $full_rev^ _may_ exist (iow, $full_rev
>>> might not be the root commit), the file being blamed may not exist there
>>> (iow $full_rev might have introduced the file).  Instead of running
>>> "rev-parse $full_rev^", you would at least need to ask "rev-list -1
>>> $full_rev^ -- $path" or something from the Porcelain layer, but
>>> unfortunately this is rather expensive.
>>
>> Doesn't blame know revision graph for history of a given file already?
> 
> Not in the sense of "rev-list -2 $full_rev -- $path | sed -e 1d".  It
> builds the graph as it digs deeper, and when it stops, it stopped digging,
> so all it knows at that point without further computation is $full_rev^@,
> and not "the previous commit that touched the path".
> 
> But as Luben explained (and you drew a simple strand of pearls history to
> illustrate), immediate parent is just for the purpose of restarting the
> computation.

What I worry about is what happens in the (rare I think) case when
_merge_ commit is blamed, and firs-parent leg is simplified in the
"per-file" history.


For example if git-blame output for given line looks like below:

  64625efeb1f216c3811230845bb519123ea0ddc5 2 2 1
  author Jakub Narebski
  author-mail <jnareb@xxxxxxxxx>
  author-time 1212711688
  author-tz +0200
  committer Jakub Narebski
  committer-mail <jnareb@xxxxxxxxx>
  committer-time 1212711688
  committer-tz +0200
  filename foo
  summary Merge branch 'b' (Fullstop _and_ capitalization)
  	Second line.

and "git show 64625efeb1f216c3811230845bb519123ea0ddc5" is:

  commit 64625efeb1f216c3811230845bb519123ea0ddc5
  Merge: 288f63a... ec70d8b...
  Author: Jakub Narebski <jnareb@xxxxxxxxx>
  Date:   Fri Jun 6 02:21:28 2008 +0200
  
      Merge branch 'b' (Fullstop _and_ capitalization)

  diff --cc foo
  index 11c1e59,2636666..888af51
  --- a/foo
  +++ b/foo
  @@@ -1,3 -1,3 +1,3 @@@
    First line
  - second line.
   -Second line
  ++Second line.
    Third line.

What should be "previous" line then? Or perhaps there should be _two_
"previous" lines.

>>> Because blame already almost knows if the commit the final blame lies on
>>> has a parent, it would be reasonably cheap to add that "parent or nothing"
>>> information to its --porcelain (and its --incremental) format if we wanted
>>> to.
>>
>> It would be easy to add 'parents' header, perhaps empty if we blame
>> root commit, or a boundary commit (do we say 'boundary' then?) when
>> doing revision limited blaming.
> 
> It shouldn't be too hard to say "parents of the blamed commit that has the
> corresponding preimage of the file is this", and the history does not have
> to be limited.  You need to also handle "the commit that introduced the
> path" case just like "root" and "boundary" that we cannot dig further than
> that point.

Errr... do your code deal with that case (no path before blamed commit)?
 
> I'll follow this message up with two weatherballoon patches.

Thanks a lot. It looks like step in good direction (implementing first
proposal), with the caveat of attributed commit being evil merge.

-- 
Jakub Narebski
Poland
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux