[PATCH] gitweb: Fix and simplify "split patch" detection

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



There are some cases when one line from "raw" git-diff output (raw format)
corresponds to more than one patch in the patchset git-diff output; we
call this situation "split patch". Current code misdetected subsequent
patches (for different files) with the same pre-image and post-image
as fragments of "split patch", leading to mislabeled from-file/to-file
diff header etc.


We could parse from_file and to_file (if to_file exists) from the "git
diff header" in the patch, i.e. from the "^diff" line in patchset, and
consider patch "split" or "continued" if not only pre-image and
post-image (from_id and to_id) matches, but only when also from_file
and to_file (if it exists) matches.  This has the advantage of being
generic, not depending on details of git diff output formatting, and
how patch output relates to raw diff output, but it adds yet another
complication.

Note: both from_name and to_name can be quoted and contain spaces;
their separation is nontrivial.


Alternate solution, which we did chose, is to check when git splits
patches, and do not check if parsed info from current patch corresponds
to current or next raw diff format output line.  Git splits patches
only for 'T' (typechange) status filepair, and there always two patches
corresponding to one raw diff line.

And we can get rid of buffering extended diff header and parsing it,
This simplifies git_patchset_body, making it easier to understand and
maintain.  At the other hand it fixes gitweb to git diff output details.


While at it we added 'status_str' to diffinfo output, which stores
status (also for merge commit) as a string.  This allows for easy
checking if there is given status among all for merge commit, e.g.
  $diffinfo->{'status_str'} =~ /T/

There are no users of from_ids_eq subroutine.

Noticed-by: Yann Dirson <ydirson@xxxxxxxxxx>
Diagnosed-by: Petr Baudis <pasky@xxxxxxx>
Signed-off-by: Jakub Narebski <jnareb@xxxxxxxxx>
---
This is post 1.5.3 resend of the patch
  Message-Id: <200708290208.08191.jnareb@xxxxxxxxx>
posted in the "[BUG] gitweb on repo.or.cz shows buggy commitdiff"
thread.

I'm a tiny little bit ambivalent about this patch: on one hand it
fixes the bug and simplifies git_patchset_body code making it easier
to maintain, on the other hand it ties patch handling code in gitweb
with the details of git-diff patch vs. raw output format.

 gitweb/gitweb.perl |   63 +++++++++++++++++++--------------------------------
 1 files changed, 24 insertions(+), 39 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index b2bae1b..8c1e02c 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1963,7 +1963,7 @@ sub parse_difftree_raw_line {
 		$res{'to_mode'} = $2;
 		$res{'from_id'} = $3;
 		$res{'to_id'} = $4;
-		$res{'status'} = $5;
+		$res{'status'} = $res{'status_str'} = $5;
 		$res{'similarity'} = $6;
 		if ($res{'status'} eq 'R' || $res{'status'} eq 'C') { # renamed or copied
 			($res{'from_file'}, $res{'to_file'}) = map { unquote($_) } split("\t", $7);
@@ -1979,6 +1979,7 @@ sub parse_difftree_raw_line {
 		$res{'to_mode'} = pop @{$res{'from_mode'}};
 		$res{'from_id'} = [ split(' ', $3) ];
 		$res{'to_id'} = pop @{$res{'from_id'}};
+		$res{'status_str'} = $4;
 		$res{'status'} = [ split('', $4) ];
 		$res{'to_file'} = unquote($5);
 	}
@@ -3112,6 +3113,7 @@ sub git_patchset_body {
 	my $patch_line;
 	my $diffinfo;
 	my (%from, %to);
+	my $patches_per_difftree_line = 1;
 
 	print "<div class=\"patchset\">\n";
 
@@ -3124,42 +3126,13 @@ sub git_patchset_body {
 
  PATCH:
 	while ($patch_line) {
-		my @diff_header;
-		my ($from_id, $to_id);
-
-		# git diff header
-		#assert($patch_line =~ m/^diff /) if DEBUG;
-		#assert($patch_line !~ m!$/$!) if DEBUG; # is chomp-ed
-		$patch_number++;
-		push @diff_header, $patch_line;
-
-		# extended diff header
-	EXTENDED_HEADER:
-		while ($patch_line = <$fd>) {
-			chomp $patch_line;
-
-			last EXTENDED_HEADER if ($patch_line =~ m/^--- |^diff /);
-
-			if ($patch_line =~ m/^index ([0-9a-fA-F]{40})..([0-9a-fA-F]{40})/) {
-				$from_id = $1;
-				$to_id   = $2;
-			} elsif ($patch_line =~ m/^index ((?:[0-9a-fA-F]{40},)+[0-9a-fA-F]{40})..([0-9a-fA-F]{40})/) {
-				$from_id = [ split(',', $1) ];
-				$to_id   = $2;
-			}
-
-			push @diff_header, $patch_line;
-		}
-		my $last_patch_line = $patch_line;
 
 		# check if current patch belong to current raw line
 		# and parse raw git-diff line if needed
-		if (defined $diffinfo &&
-		    defined $from_id && defined $to_id &&
-		    from_ids_eq($diffinfo->{'from_id'}, $from_id) &&
-		    $diffinfo->{'to_id'} eq $to_id) {
+		if ($patches_per_difftree_line > 1) {
 			# this is continuation of a split patch
 			print "<div class=\"patch cont\">\n";
+			$patches_per_difftree_line--;
 		} else {
 			# advance raw git-diff output if needed
 			$patch_idx++ if defined $diffinfo;
@@ -3167,7 +3140,7 @@ sub git_patchset_body {
 			# compact combined diff output can have some patches skipped
 			# find which patch (using pathname of result) we are at now
 			my $to_name;
-			if ($diff_header[0] =~ m!^diff --cc "?(.*)"?$!) {
+			if ($patch_line =~ m!^diff --cc "?(.*)"?$!) {
 				$to_name = $1;
 			}
 
@@ -3191,6 +3164,7 @@ sub git_patchset_body {
 				}
 			} until (!defined $to_name || $to_name eq $diffinfo->{'to_file'} ||
 			         $patch_idx > $#$difftree);
+
 			# modifies %from, %to hashes
 			parse_from_to_diffinfo($diffinfo, \%from, \%to, @hash_parents);
 			if ($diffinfo->{'nparents'}) {
@@ -3232,32 +3206,43 @@ sub git_patchset_body {
 			# this is first patch for raw difftree line with $patch_idx index
 			# we index @$difftree array from 0, but number patches from 1
 			print "<div class=\"patch\" id=\"patch". ($patch_idx+1) ."\">\n";
+
+			# typechange diff with 'T' status has two patches per one raw line
+			if ($diffinfo->{'status_str'} =~ /T/) {
+				$patches_per_difftree_line = 2;
+			}
 		}
 
+		# git diff header
+		#assert($patch_line =~ m/^diff /) if DEBUG;
+		#assert($patch_line !~ m!$/$!) if DEBUG; # is chomp-ed
+		$patch_number++;
 		# print "git diff" header
-		$patch_line = shift @diff_header;
 		print format_git_diff_header_line($patch_line, $diffinfo,
 		                                  \%from, \%to);
 
 		# print extended diff header
-		print "<div class=\"diff extended_header\">\n" if (@diff_header > 0);
+		print "<div class=\"diff extended_header\">\n";
 	EXTENDED_HEADER:
-		foreach $patch_line (@diff_header) {
+		while ($patch_line = <$fd>) {
+			chomp $patch_line;
+
+			last EXTENDED_HEADER if ($patch_line =~ m/^--- |^diff /);
+
 			print format_extended_diff_header_line($patch_line, $diffinfo,
 			                                       \%from, \%to);
 		}
-		print "</div>\n"  if (@diff_header > 0); # class="diff extended_header"
+		print "</div>\n"; # class="diff extended_header"
 
 		# from-file/to-file diff header
-		$patch_line = $last_patch_line;
 		if (! $patch_line) {
 			print "</div>\n"; # class="patch"
 			last PATCH;
 		}
 		next PATCH if ($patch_line =~ m/^diff /);
 		#assert($patch_line =~ m/^---/) if DEBUG;
-		#assert($patch_line eq $last_patch_line) if DEBUG;
 
+		my $last_patch_line = $patch_line;
 		$patch_line = <$fd>;
 		chomp $patch_line;
 		#assert($patch_line =~ m/^\+\+\+/) if DEBUG;
-- 
1.5.2.5

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux