Re: [PATCH v6] Add new git-related helper to contrib

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 23 May 2013 15:44:16 -0700

Felipe Contreras <felipe.contreras@xxxxxxxxx> writes:

>> Then in that "next iteration", we pick blame entry for A and decide
>> to see if HEAD^, which is the suspect for A, can be exonerated of
>> blames for _any_ (not just A) blame entry it currently is suspected
>> for.  We call pass_blame() and it will find and process both A and
>> C with a single "git diff-tree", attempting to pass blame to HEAD^^
>> and its ancestors.
>
> All right, my code still works in that situation.

When HEAD^ was processed, pass_blame() finds that the first line in
A is attributable to HEAD^ (our current suspect) while the rest were
copied from a different file in HEAD^^ .  All lines in C are found
to be copy from a different file in HEAD^^.

Then your scoreboard would have:

  1. a blame entry that failed to pass blame to parents of HEAD^ (the
     first line in A), which still has suspect == HEAD^

  2. a blame entry that covers the remainder of A, blaming HEAD^^.

  3. a blame entry that covers all of B, whose final guilty party is
     known already.

  4. a blame entry that covers all of C, blaming a different file in
     HEAD^^.

Your "Take responsibility" loop goes over these blame entries, sets
found_guilty to true because of the first blame entry (the first
line of A), and calls print_guilty_entry() for blame entry 1,
showing that HEAD^ is guilty for the first line of A.

After the loop, your "if we did not find anybody guilty" logic does
not kick in, and the original line range for block A you saved in
tmp_ent is not used.

You lost the fact that the second and remainder of A were in this
file at the commit HEAD^ but not in HEAD^^ (i.e. these lines were
moved by HEAD^ from elsewhere).

The fact that HEAD^ touched _something_ is not lost, so if _all_ you
care about is "list all the commits, but I do not care about what
line range was modified how", you can claim it "working", but that
is a very limited definition of "working" and is not very reusable
or extensible in order to help those like gitk that currently have
to run two separate blames.  They need an output that lets them tell
between

 - this is the earliest we saw these lines in the path (it may have
   been copied from another path, but this entry in the incremental
   output stream is not about that final blame); and

 - this is the final blame where these lines came from, possibly
   from different path"

and coalesce both kind of origin..

Also the fact that the entire C was copied from elsewhere by HEAD^
is lost but that is the same issue.  The next round will not find
any blame entry that is !ent->guity because the call to pass_blame()
for HEAD^ already handled the final blame not just for blame entries
1 and 2, but also for 4 during this round.

If the change in HEAD^ in the above example were to copy the whole A
and C from a different file, then your !found_guilty logic would
kick in and say all lines of A where copied from elsewhere in HEAD^,
but again we would not learn the same information for C.

I do not think "I care only about a single bit per commit, i.e. if
the commit touched the path; screw everybody else who wants to
actually know where each line came from" can be called "working".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html