Re: `git blame` Line Number Off-by-one

On Fri, Aug 07, 2020 at 01:05:51AM +0000, Nuthan Munaiah wrote:

>  * Clone https://github.com/apache/tomcat
>  * Run `git blame --root -leftw -L 21,21 -L 23,23
> 51844327d8613448bb0bf9667e1a61e462e2043c^ --
> modules/jdbc-pool/java/org/apache/tomcat/jdbc/pool/PoolProperties.java`
>
> [...]
> 
> `git blame` shows the last commit that modified lines 21 and 23 of
> `modules/jdbc-pool/java/org/apache/tomcat/jdbc/pool/PoolProperties.java`
> starting at the parent of `51844327d8613448bb0bf9667e1a61e462e2043c`.
>
> [...]
>
> Line 23 is not shown in the `git blame` output. Instead, line 22 is shown.

Thanks for providing an easy reproduction case.

I think the issue is not in the -L input or in the blame algorithm
itself, but in the hunk-coalescing at the end.

As you note, this shows up even with --porcelain:

  $ commit=51844327d8613448bb0bf9667e1a61e462e2043c^
  $ fn=modules/jdbc-pool/java/org/apache/tomcat/jdbc/pool/PoolProperties.java
  $ git blame --porcelain -L 21,21 -L 23,23 $commit -- $fn |
    egrep '^[0-9a-f]{40}'
  c65a429f06f4e4a025a306e377211863d9ff2a0c 21 21 2
  c65a429f06f4e4a025a306e377211863d9ff2a0c 22 22

but if we try --incremental:

  $ git blame --incremental -L 21,21 -L 23,23 $commit -- $fn |
    egrep '^[0-9a-f]{40}'
  c65a429f06f4e4a025a306e377211863d9ff2a0c 21 21 1
  c65a429f06f4e4a025a306e377211863d9ff2a0c 22 23 1

So at the moment we find the line, we do know that it is line 23 in
the final result but was line 22 in the earlier version at c65a429f06.
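
To make the columns explicit: each of these header lines is
"<sha> <line in source> <line in result> [<number of lines>]". Here is a
minimal Python sketch that decodes the two --incremental lines above. The
parse_header() helper and the Header type are hypothetical names used
only for illustration; they are not part of git.

```python
from typing import NamedTuple, Optional


class Header(NamedTuple):
    sha: str
    source_line: int           # 1-based line in the blamed commit's version
    result_line: int           # 1-based line in the file we asked about
    num_lines: Optional[int]   # absent on non-leading lines of a --porcelain group


def parse_header(line: str) -> Header:
    """Split one '<sha> <source> <result> [<count>]' header line."""
    fields = line.split()
    num_lines = int(fields[3]) if len(fields) > 3 else None
    return Header(fields[0], int(fields[1]), int(fields[2]), num_lines)


incremental = [
    "c65a429f06f4e4a025a306e377211863d9ff2a0c 21 21 1",
    "c65a429f06f4e4a025a306e377211863d9ff2a0c 22 23 1",
]

for h in map(parse_header, incremental):
    print(f"{h.sha[:10]}: result line {h.result_line} "
          f"came from source line {h.source_line}")
```

Read this way, the second --incremental line says "result line 23, source
line 22", whereas the second --porcelain line claims result line 22.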

And indeed, running a non-incremental blame in a debugger, right before
calling blame_coalesce() our entries look like this:

  cmd_blame (argc=3, argv=0x7fffffffe458, prefix=0x0) at builtin/blame.c:1146
  1146		blame_coalesce(&sb);
  (gdb) print *sb->ent 
  $44 = {next = 0x55555596eda0, lno = 20, num_lines = 1, suspect = 0x555555999a30, s_lno = 20, score = 0, ignored = 0, 
    unblamable = 0}
  (gdb) print *sb->ent->next
  $45 = {next = 0x0, lno = 22, num_lines = 1, suspect = 0x555555999a30, s_lno = 21, score = 0, ignored = 0, 
    unblamable = 0}

So we have two one-line entries at lines 21 and 23 of the result ("lno";
note that internally we zero-index the lines), and we know that the
second one actually comes from line 22 of the source ("s_lno").
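
As a sanity check on the zero-indexing, here is a small Python model of
the two entries from the gdb dump. The Entry class and its helper methods
are illustrative assumptions, keeping only the three fields that matter
here; the real struct has more.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    lno: int        # 0-based first line in the final result
    s_lno: int      # 0-based first line in the suspect's (source) version
    num_lines: int

    def result_range(self):
        """1-based inclusive line range in the result."""
        return (self.lno + 1, self.lno + self.num_lines)

    def source_range(self):
        """1-based inclusive line range in the source."""
        return (self.s_lno + 1, self.s_lno + self.num_lines)


# Values copied from the gdb output before blame_coalesce().
entries = [
    Entry(lno=20, s_lno=20, num_lines=1),
    Entry(lno=22, s_lno=21, num_lines=1),
]

for e in entries:
    print("result", e.result_range(), "<- source", e.source_range())
```

The second entry maps result line 23 to source line 22, which agrees with
the --incremental output.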

But then after blame_coalesce() returns, we have only one entry with
both lines:

  (gdb) n
  1148		if (!(output_option & (OUTPUT_COLOR_LINE | OUTPUT_SHOW_AGE_WITH_COLOR)))
  (gdb) print *sb->ent
  $46 = {next = 0x0, lno = 20, num_lines = 2, suspect = 0x555555999a30, s_lno = 20, score = 0, ignored = 0, 
    unblamable = 0}

Presumably it saw that the lines are adjacent in the _source_ file and
coalesced them, but that's not the right thing to do. They're distinct
hunks in the output we're going to show, so they have to remain separate.
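
To illustrate that guess, here is a simplified Python model of the
coalescing step. It is not git's actual code: the Entry class, the
coalesce() function and the check_result_adjacency flag are all
hypothetical, and the merge condition is only my assumption about what
blame_coalesce() checks.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    suspect: str    # stands in for the suspect pointer
    lno: int        # 0-based line in the final result
    s_lno: int      # 0-based line in the source
    num_lines: int


def coalesce(entries, check_result_adjacency):
    """Merge neighboring entries from the same suspect.

    Without check_result_adjacency, entries merge whenever they are
    adjacent in the source. With it, they must also be adjacent in the
    result.
    """
    merged = []
    for ent in entries:
        prev = merged[-1] if merged else None
        if (prev is not None
                and prev.suspect == ent.suspect
                and prev.s_lno + prev.num_lines == ent.s_lno
                and (not check_result_adjacency
                     or prev.lno + prev.num_lines == ent.lno)):
            merged[-1] = replace(prev, num_lines=prev.num_lines + ent.num_lines)
        else:
            merged.append(ent)
    return merged


entries = [
    Entry("c65a429f06", lno=20, s_lno=20, num_lines=1),
    Entry("c65a429f06", lno=22, s_lno=21, num_lines=1),
]

print(coalesce(entries, check_result_adjacency=False))
print(coalesce(entries, check_result_adjacency=True))
```

With the source-only check, the model collapses the two entries into one
with lno = 20 and num_lines = 2, matching the gdb dump above. Requiring
adjacency in the result as well keeps them as two separate hunks.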

-Peff


