Re: [PATCH v6] Add new git-related helper to contrib

Felipe Contreras <felipe.contreras@xxxxxxxxx> · Thu, 23 May 2013 17:59:37 -0500

On Thu, May 23, 2013 at 5:44 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Felipe Contreras <felipe.contreras@xxxxxxxxx> writes:
>
>>> Then in that "next iteration", we pick blame entry for A and decide
>>> to see if HEAD^, which is the suspect for A, can be exonerated of
>>> blames for _any_ (not just A) blame entry it currently is suspected
>>> for.  We call pass_blame() and it will find and process both A and
>>> C with a single "git diff-tree", attempting to pass blame to HEAD^^
>>> and its ancestors.
>>
>> All right, my code still works in that situation.
>
> When HEAD^ was processed, pass_blame() finds that the first line in
> A is attributable to HEAD^ (our current suspect) while the rest were
> copied from a different file in HEAD^^ .  All lines in C are found
> to be copy from a different file in HEAD^^.
>
> Then your scoreboard would have:
>
>   1. a blame entry that failed to pass blame to parents of HEAD^ (the
>      first line in A), which still has suspect == HEAD^
>
>   2. a blame entry that covers the remainder of A, blaming HEAD^^.
>
>   3. a blame entry that covers all of B, whose final guilty party is
>      known already.
>
>   4. a blame entry that covers all of C, blaming a different file in
>      HEAD^^.
>
> Your "Take responsibility" loop goes over these blame entries, sets
> found_guilty to true because of the first blame entry (the first
> line of A), and calls print_guilty_entry() for blame entry 1,
> showing that HEAD^ is guilty for the first line of A.
>
> After the loop, your "if we did not find anybody guilty" logic does
> not kick in, and the original line range for block A you saved in
> tmp_ent is not used.
>
> You lost the fact that the second and remainder of A were in this
> file at the commit HEAD^ but not in HEAD^^ (i.e. these lines were
> moved by HEAD^ from elsewhere).

> The fact that HEAD^ touched _something_ is not lost, so if _all_ you
> care about is "list all the commits, but I do not care about what
> line range was modified how", you can claim it "working",

The line range printed is correct.

> but that
> is a very limited definition of "working" and is not very reusable
> or extensible in order to help those like gitk that currently have
> to run two separate blames.  They need an output that lets them tell
> between
>
>  - this is the earliest we saw these lines in the path (it may have
>    been copied from another path, but this entry in the incremental
>    output stream is not about that final blame); and
>
>  - this is the final blame where these lines came from, possibly
>    from different path"
>
> and coalesce both kind of origin..

They can already do that by looking at the lines; if they get replaced
by another entry; they are not guilty.

Or we can add a 'guilty' line to the incremental output, that's trivial.

> Also the fact that the entire C was copied from elsewhere by HEAD^
> is lost but that is the same issue.  The next round will not find
> any blame entry that is !ent->guity because the call to pass_blame()
> for HEAD^ already handled the final blame not just for blame entries
> 1 and 2, but also for 4 during this round.

No, that's not true. If it's a different file the blame entry is not
found guilty.

> If the change in HEAD^ in the above example were to copy the whole A
> and C from a different file, then your !found_guilty logic would
> kick in and say all lines of A where copied from elsewhere in HEAD^,
> but again we would not learn the same information for C.

We would, when it's the turn for C, which is not guilty at this point.

> I do not think "I care only about a single bit per commit, i.e. if
> the commit touched the path; screw everybody else who wants to
> actually know where each line came from" can be called "working".

They can know where each line came from; the lines of each entry are
printed properly; that's the whole point of 'tmp_ent'.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html