Re: On blame/pickaxe

Luben Tuikov <ltuikov@xxxxxxxxx> writes:

>> 3. Passing the blame.
>> 
>> You (a <commit,path> tuple) are suspected for introducing
>> certain lines, and you would want to pass blame to your parent.
>> How would you do that?
>> 
>> First, you find if your parent has the same path; if not, you
>> find if between your parent and you there was a rename and find
>> the original path in the parent.  If you are a merge, you do so
>> for all your parents.  The path in the parent and your path may
>> have many common lines, and if the lines you are the suspect are
>> the same as the ones in the parent, you can pass the blame,
>> because these lines were there before you touched them.
>
> Do you handle the case where a merge had a conflict and
> the user changed the code (resolved) and then committed?
> In this case some lines will have to be blamed on the
> merge commit itself.

Working on a small example by hand is a good way to convince
yourself.  The whole point of "try to pass the blame, and take
the blame yourself only when you can't pass to anybody" is
precisely to handle the merges sanely.  The answer to your later
question also would become crystal clear with that exercise.

Suppose that we are looking at a merge that would give this in
its "git show" output:

diff --cc hello.c
index 3c27792,db3fdef..cec80d2
--- a/hello.c
+++ b/hello.c
@@@ -1,4 -1,6 +1,6 @@@
  int main(int ac, char **av)
  {
-       printf("hello, world.\n");
 -      const char *msg = "hello, world";
++      const char *msg = "hello, world.";
+ 
+       printf("%s\n", msg);
  }

First, we inspect the diff from the first parent:

        diff --git a/hello.c b/hello.c
        index 3c27792..cec80d2 100644
        --- a/hello.c
        +++ b/hello.c
        @@ -1,4 +1,6 @@
1        int main(int ac, char **av)
2        {
        -       printf("hello, world.\n");
3       +       const char *msg = "hello, world.";
4       +
5       +       printf("%s\n", msg);
6        }

That would find that lines 1, 2 and 6 came from the first parent
(line numbers are of the postimage; e.g. line 6 is the closing
brace).

We are still left with lines 3, 4, and 5.  So next we look at the
diff from the second parent:

        diff --git a/hello.c b/hello.c
        index db3fdef..cec80d2 100644
        --- a/hello.c
        +++ b/hello.c
        @@ -1,6 +1,6 @@
1        int main(int ac, char **av)
2        {
        -       const char *msg = "hello, world";
3       +       const char *msg = "hello, world.";
4
5               printf("%s\n", msg);
6        }

It shows that lines 1, 2, 4, 5 and 6 are the same as in the second
parent (again, line numbers are of the postimage).  This means
that we _could_ attribute lines 1, 2 and 6 to the second parent
if we wanted to, but we have already passed blame for 1, 2 and 6
to the first parent [*1*], so only lines 4 and 5 are assigned to
the second parent.

At this point, we have no more parents to pass blame to and are
still left with line 3.  So we end up taking the blame for that
line ourselves.  The final blame output reflects that.
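
The hand-worked steps above can be mimicked outside git with a small
shell sketch.  It is only an illustration under simplifying
assumptions: it runs plain diff(1) on three files instead of walking
commits, it ignores renames and line movement, and the names parent1,
parent2, merge, added_lines and blame are made up for this example
rather than taken from git.  Parents are tried in order, which mirrors
the first-parent preference described in [*1*].

```sh
#!/bin/sh
# Sketch only: mimic "pass the blame" for one merge with diff(1), not git.
# parent1/parent2/merge, added_lines and blame are hypothetical names.

tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT

cat >"$tmp/parent1" <<'EOF'
int main(int ac, char **av)
{
	printf("hello, world.\n");
}
EOF
cat >"$tmp/parent2" <<'EOF'
int main(int ac, char **av)
{
	const char *msg = "hello, world";

	printf("%s\n", msg);
}
EOF
cat >"$tmp/merge" <<'EOF'
int main(int ac, char **av)
{
	const char *msg = "hello, world.";

	printf("%s\n", msg);
}
EOF

# Print the line numbers of postimage $2 that are absent from preimage $1.
added_lines () {
	diff -u "$1" "$2" | awk '
		NR <= 2 { next }	# skip the ---/+++ file headers
		/^@@/ { split($3, r, ","); n = substr(r[1], 2) + 0; next }
		/^[+]/ { print n; n++; next }	# only in the postimage
		/^-/ { next }	# only in the preimage
		/^\\/ { next }	# "\ No newline at end of file"
		{ n++ }	# context line, shared with the parent
	'
}

# Offer every suspected line to each parent in order; whatever no
# parent has stays with the merge itself.
blame () {
	suspect=$(awk 'END { for (i = 1; i <= NR; i++) print i }' "$tmp/merge")
	for parent in parent1 parent2
	do
		added=$(added_lines "$tmp/$parent" "$tmp/merge" | tr '\n' ' ')
		kept=
		for line in $suspect
		do
			case " $added" in
			*" $line "*) kept="$kept $line" ;;	# parent lacks this line
			*) echo "$line $parent" ;;		# parent has it: pass blame
			esac
		done
		suspect=$kept
	done
	for line in $suspect
	do
		echo "$line merge"
	done
}

blame | sort -n
```

Running it should list lines 1, 2 and 6 against parent1, lines 4 and 5
against parent2, and line 3 against merge, which matches the result
worked out by hand.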

If you are interested, prepare an example repository using the
attached script, and try annotating E, like this:

	git pickaxe --not right~2 left -- E

This demonstrates the example in this message (first parent is
Right and the second is Left).

Annotating C with blame and pickaxe (use "-n -f" for clarity)
shows the limitation of the original 'blame' that can use only
one path per commit.  This is a corner case where two files
originally different in the common ancestor were later merged
into one.  pickaxe handles this case without -M.

	git blame -n -f C
	git pickaxe -n -f C

Annotating D with pickaxe with and without -M illustrates how
line movement is handled.

	git pickaxe -M D

Have fun.


[*1*] The really core part of git does not have any preference
among parents, but typically a merge commit is made with the
current branch head as its first parent and the other branch as
its second parent, so favoring the earlier parent over the later
ones makes a lot of sense in practice.  This is in line with
other parts of git, including the merge simplification done by
git-log.

-- 8< --
#!/bin/sh

# Refuse to run inside an existing repository.
test -d .git && {
	echo Run me in an empty directory.
	exit 1
}
git init-db

for i in 1 2 3 4 5 6 7 8 9 ; do echo line from initial $i ; done >A
for i in A B C D E F G H I ; do echo line from initial $i ; done >B
cp A D
cat >E <<EOF
int main(int ac, char **av)
{
	printf("hello, world\n");
}
EOF

git add A B D E
git commit --author='Initial <initial@author>' -m initial

git branch right
git branch left

# Left
git checkout left
for i in 1 2 3; do echo added by left; done >C
cat A >>C
rm -f A B
cat >E <<EOF
int main(int ac, char **av)
{
	const char *msg = "hello, world";

	printf("%s\n", msg);
}
EOF
git update-index --add --remove A B C E
git commit --author='Left <left@branch>' -m Left

# Right
git checkout right
cat B >C
for i in 1 2 3; do echo added by right; done >>C
rm -f A B
cat >E <<EOF
int main(int ac, char **av)
{
	printf("hello, world.\n");
}
EOF
git update-index --add --remove A B C E
git commit --author='Right <right@branch>' -m Right

echo "Merge -- this should fail, which is expected; the script fixes it up."
echo "Do not be alarmed by the error message."
git pull . left
echo "Fixing up..."
{
	git cat-file blob :3:C
	echo line by evil merge
	git cat-file blob :2:C
} >C
cat >E <<EOF
int main(int ac, char **av)
{
	const char *msg = "hello, world.";

	printf("%s\n", msg);
}
EOF
git update-index C E
git commit --author='Merge <merge@branch>' -m 'Changes are merged.'
rm -f C~*

{
	for i in 5 6 7; do echo line from initial $i ; done
	echo line modified while swapping 8
	for i in 9 1 2 3 4 ; do echo line from initial $i ; done
} >D

git update-index D
git commit --author='Swap <swap@branch>' -m 'Lines are swapped.'

echo "Now try annotating C, D and E with various options."
