[PATCH/RFC 1/x] gitweb: Use git-diff-tree patch output for commitdiff

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



As the subject indicates (RFC), I'd like some comments on that patch.


This patch is the first patch in series which tries to remove external
diff dependency in gitweb, and the need for temporary files.

Converting "blobdiff" and "blobdiff_plain" format would be much easier
if git-diff-tree and friends had -L <label>/--label=<label> option,
like GNU diff has.

Current patch preserves much of current output; the question is if for
example generate if 'plain' format should generate patch which could
be appplied by ordinary patch which do not understand git diff
extensions (including renaming and copying), as it is done in current
version, and if 'html' version should detect renames and copying.

Another question is if to have (as it is currently done) 'html' and
'plain' format generated by one subroutine, even though they don't
share that much code, and how to divide this code (currently invoking
git-commit-diff is in one part, generating header and commit message
is in second part, generating patch/patchset is in third part).


Further patches are planned, including getting rid of
git_commitdiff_plain, changing format_diff_line to include incomplete
lines in output, and converting git_blobdiff and git_blobdiff_plain to
use git-diff-tree.

This patch is based on 'next' (fbf19dd41bb51d5221fac739c5bdb48fd9012412)


P.S. Perhaps the below separator should be made standard and
recognized by git tools?
 
-- >8 --
Get rid of git_diff_print invocation in git_commitdiff and therefore
external diff (/usr/bin/diff) invocation, and use only git-diff-tree
to generate patch.

git_commitdiff and git_commitdiff_plain are collapsed into one
subroutine git_commitdiff, with format (currently 'html' which is
default format corresponding to git_commitdiff, and 'plain'
corresponding to git_commitdiff_plain) specified in argument.

Separate patch (diff) pretty-printing into git_patchset_body.
It is used in git_commitdiff.

Separate patch (diff) line formatting from git_diff_print into
format_diff_line function. It is used in git_patchset_body.

While at it, add $hash parameter to git_difftree_body, according to
rule that inner functions should use parameter passing, and not global
variables.

CHANGES TO OUTPUT:
 * "commitdiff" now products patches with renaming and copying
   detection (git-diff-tree is invoked with -M and -C options).
   Empty patches (mode changes and pure renames and copying)
   are not written currently. Former version broke renaming and
   copying, and didn't notice mode changes, like this version.

 * "commitdiff" output is now divided into several div elements
   of class "log", "patchset" and "patch".

 * "commitdiff_plain" now only generates X-Git-Tag: line only if there
   is tag pointing to the current commit. Former version which wrote
   first tag following current commit was broken[*1*]; besides we are
   interested rather in tags _preceding_ the commit, and _heads_
   following the commit. X-Git-Url: now is current URL; former version
   tried[*2*] to output URL to HTML version of commitdiff.

 * "commitdiff_plain" is generated by git-diff-tree, and has therefore
   has git specific extensions to diff format: "git diff" header and
   optional extended header lines.

FOOTNOTES
[*1*] First it generated rev-list starting from HEAD even if hash_base
parameter was set, second it wasn't corrected according to changes
made in git_get_references (formerly read_info_ref) output, third even
for older version of read_info_ref output it didn't work for multiple
tags pointing to the current commit (rare).

[*2*] It wrote URL for commitdiff without hash_parent, which produces
diff to first parent and is not the same as current diff if it is diff
of merge commit to non-first parent.

Signed-off-by: Jakub Narebski <jnareb@xxxxxxxxx>
---
 gitweb/gitweb.perl |  323 ++++++++++++++++++++++++++++++++--------------------
 1 files changed, 201 insertions(+), 122 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 50083e3..9367685 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -524,6 +524,27 @@ sub format_subject_html {
 	}
 }
 
+sub format_diff_line {
+	my $line = shift;
+	my $char = substr($line, 0, 1);
+	my $diff_class = "";
+
+	chomp $line;
+
+	if ($char eq '+') {
+		$diff_class = " add";
+	} elsif ($char eq "-") {
+		$diff_class = " rem";
+	} elsif ($char eq "@") {
+		$diff_class = " chunk_header";
+	} elsif ($char eq "\\") {
+		# skip errors (incomplete lines)
+		return "";
+	}
+	$line = untabify($line);
+	return "<div class=\"diff$diff_class\">" . esc_html($line) . "</div>\n";
+}
+
 ## ----------------------------------------------------------------------
 ## git utility subroutines, invoking git commands
 
@@ -1367,7 +1388,7 @@ ## .....................................
 ## functions printing large fragments of HTML
 
 sub git_difftree_body {
-	my ($difftree, $parent) = @_;
+	my ($difftree, $hash, $parent) = @_;
 
 	print "<div class=\"list_head\">\n";
 	if ($#{$difftree} > 10) {
@@ -1518,6 +1539,101 @@ sub git_difftree_body {
 	print "</table>\n";
 }
 
+sub git_patchset_body {
+	my ($patchset, $difftree, $hash, $hash_parent) = @_;
+
+	my $patch_idx = 0;
+	my $in_header = 0;
+	my $patch_found = 0;
+	my %diffinfo;
+
+	print "<div class=\"patchset\">\n";
+
+	LINE: foreach my $patch_line (@$patchset) {
+
+		if ($patch_line =~ m/^diff /) { # "git diff" header
+			# beginning of patch (in patchset)
+			if ($patch_found) {
+				# close previous patch
+				print "</div>\n"; # class="patch"
+			} else {
+				# first patch in patchset
+				$patch_found = 1;
+			}
+			print "<div class=\"patch\">\n";
+
+			%diffinfo = parse_difftree_raw_line($difftree->[$patch_idx++]);
+
+			# for now, no extended header, hence we skip empty patches
+			# companion to	next LINE if $in_header;
+			if ($diffinfo{'from_id'} eq $diffinfo{'to_id'}) { # no change
+				$in_header = 1;
+				next LINE;
+			}
+
+			if ($diffinfo{'status'} eq "A") { # added
+				print "<div class=\"diff_info\">" . file_type($diffinfo{'to_mode'}) . ":" .
+				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
+				                             hash=>$diffinfo{'to_id'}, file_name=>$diffinfo{'file'})},
+				              $diffinfo{'to_id'}) . "(new)" .
+				      "</div>\n"; # class="diff_info"
+
+			} elsif ($diffinfo{'status'} eq "D") { # deleted
+				print "<div class=\"diff_info\">" . file_type($diffinfo{'from_mode'}) . ":" .
+				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
+				                             hash=>$diffinfo{'from_id'}, file_name=>$diffinfo{'file'})},
+				              $diffinfo{'from_id'}) . "(deleted)" .
+				      "</div>\n"; # class="diff_info"
+
+			} elsif ($diffinfo{'status'} eq "R" || # renamed
+			         $diffinfo{'status'} eq "C") { # copied
+				print "<div class=\"diff_info\">" .
+				      file_type($diffinfo{'from_mode'}) . ":" .
+				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
+				                             hash=>$diffinfo{'from_id'}, file_name=>$diffinfo{'from_file'})},
+				              $diffinfo{'from_id'}) .
+				      " -> " .
+				      file_type($diffinfo{'to_mode'}) . ":" .
+				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
+				                             hash=>$diffinfo{'to_id'}, file_name=>$diffinfo{'to_file'})},
+				              $diffinfo{'to_id'});
+				print "</div>\n"; # class="diff_info"
+
+			} else { # modified, mode changed, ...
+				print "<div class=\"diff_info\">" .
+				      file_type($diffinfo{'from_mode'}) . ":" .
+				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
+				                             hash=>$diffinfo{'from_id'}, file_name=>$diffinfo{'file'})},
+				              $diffinfo{'from_id'}) .
+				      " -> " .
+				      file_type($diffinfo{'to_mode'}) . ":" .
+				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
+				                             hash=>$diffinfo{'to_id'}, file_name=>$diffinfo{'file'})},
+				              $diffinfo{'to_id'});
+				print "</div>\n"; # class="diff_info"
+			}
+
+			#print "<div class=\"diff extended_header\">\n";
+			$in_header = 1;
+			next LINE;
+		} # start of patch in patchset
+
+
+		if ($in_header && $patch_line =~ m/^---/) {
+			#print "</div>\n"
+			$in_header = 0;
+		}
+		next LINE if $in_header;
+
+		print format_diff_line($patch_line);
+	}
+	print "</div>\n" if $patch_found; # class="patch"
+
+	print "</div>\n"; # class="patchset"
+}
+
+# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
+
 sub git_shortlog_body {
 	# uses global variable $project
 	my ($revlist, $from, $to, $refs, $extra) = @_;
@@ -2556,7 +2672,7 @@ sub git_commit {
 	git_print_log($co{'comment'});
 	print "</div>\n";
 
-	git_difftree_body(\@difftree, $parent);
+	git_difftree_body(\@difftree, $hash, $parent);
 
 	git_footer_html();
 }
@@ -2600,7 +2716,7 @@ sub git_blobdiff_plain {
 }
 
 sub git_commitdiff {
-	mkdir($git_temp, 0700);
+	my $format = shift || 'html';
 	my %co = parse_commit($hash);
 	if (!%co) {
 		die_error(undef, "Unknown commit object");
@@ -2608,143 +2724,106 @@ sub git_commitdiff {
 	if (!defined $hash_parent) {
 		$hash_parent = $co{'parent'} || '--root';
 	}
-	open my $fd, "-|", $GIT, "diff-tree", '-r', $hash_parent, $hash
-		or die_error(undef, "Open git-diff-tree failed");
-	my @difftree = map { chomp; $_ } <$fd>;
-	close $fd or die_error(undef, "Reading git-diff-tree failed");
+
+	# read commitdiff
+	my $fd;
+	my @difftree;
+	my @patchset;
+	if ($format eq 'html') {
+		open $fd, "-|", $GIT, "diff-tree", '-r', '-M', '-C',
+			"--patch-with-raw", "--full-index", $hash_parent, $hash
+			or die_error(undef, "Open git-diff-tree failed");
+
+		while (chomp(my $line = <$fd>)) {
+			# empty line ends raw part of diff-tree output
+			last unless $line;
+			push @difftree, $line;
+		}
+		@patchset = map { chomp; $_ } <$fd>;
+
+		close $fd
+			or die_error(undef, "Reading git-diff-tree failed");
+	} elsif ($format eq 'plain') {
+		open $fd, "-|", $GIT, "diff-tree", '-r', '-p', '-B', $hash_parent, $hash
+			or die_error(undef, "Open git-diff-tree failed");
+	} else {
+		die_error(undef, "Unknown commitdiff format");
+	}
 
 	# non-textual hash id's can be cached
 	my $expires;
 	if ($hash =~ m/^[0-9a-fA-F]{40}$/) {
 		$expires = "+1d";
 	}
-	my $refs = git_get_references();
-	my $ref = format_ref_marker($refs, $co{'id'});
-	my $formats_nav =
-		$cgi->a({-href => href(action=>"commitdiff_plain", hash=>$hash, hash_parent=>$hash_parent)},
-		        "plain");
-	git_header_html(undef, $expires);
-	git_print_page_nav('commitdiff','', $hash,$co{'tree'},$hash, $formats_nav);
-	git_print_header_div('commit', esc_html($co{'title'}) . $ref, $hash);
-	print "<div class=\"page_body\">\n";
-	git_print_simplified_log($co{'comment'}, 1); # skip title
-	print "<br/>\n";
-	foreach my $line (@difftree) {
-		# ':100644 100644 03b218260e99b78c6df0ed378e59ed9205ccc96d 3b93d5e7cc7f7dd4ebed13a5cc1a4ad976fc94d8 M      ls-files.c'
-		# ':100644 100644 7f9281985086971d3877aca27704f2aaf9c448ce bc190ebc71bbd923f2b728e505408f5e54bd073a M      rev-tree.c'
-		if ($line !~ m/^:([0-7]{6}) ([0-7]{6}) ([0-9a-fA-F]{40}) ([0-9a-fA-F]{40}) (.)\t(.*)$/) {
-			next;
-		}
-		my $from_mode = $1;
-		my $to_mode = $2;
-		my $from_id = $3;
-		my $to_id = $4;
-		my $status = $5;
-		my $file = validate_input(unquote($6));
-		if ($status eq "A") {
-			print "<div class=\"diff_info\">" . file_type($to_mode) . ":" .
-			      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-			                             hash=>$to_id, file_name=>$file)},
-			              $to_id) . "(new)" .
-			      "</div>\n";
-			git_diff_print(undef, "/dev/null", $to_id, "b/$file");
-		} elsif ($status eq "D") {
-			print "<div class=\"diff_info\">" . file_type($from_mode) . ":" .
-			      $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
-			                             hash=>$from_id, file_name=>$file)},
-			              $from_id) . "(deleted)" .
-			      "</div>\n";
-			git_diff_print($from_id, "a/$file", undef, "/dev/null");
-		} elsif ($status eq "M") {
-			if ($from_id ne $to_id) {
-				print "<div class=\"diff_info\">" .
-				      file_type($from_mode) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash_parent,
-				                             hash=>$from_id, file_name=>$file)},
-				              $from_id) .
-				      " -> " .
-				      file_type($to_mode) . ":" .
-				      $cgi->a({-href => href(action=>"blob", hash_base=>$hash,
-				                             hash=>$to_id, file_name=>$file)},
-				              $to_id);
-				print "</div>\n";
-				git_diff_print($from_id, "a/$file",  $to_id, "b/$file");
-			}
-		}
-	}
-	print "<br/>\n" .
-	      "</div>";
-	git_footer_html();
-}
 
-sub git_commitdiff_plain {
-	mkdir($git_temp, 0700);
-	my %co = parse_commit($hash);
-	if (!%co) {
-		die_error(undef, "Unknown commit object");
-	}
-	if (!defined $hash_parent) {
-		$hash_parent = $co{'parent'} || '--root';
-	}
-	open my $fd, "-|", $GIT, "diff-tree", '-r', $hash_parent, $hash
-		or die_error(undef, "Open git-diff-tree failed");
-	my @difftree = map { chomp; $_ } <$fd>;
-	close $fd or die_error(undef, "Reading diff-tree failed");
-
-	# try to figure out the next tag after this commit
-	my $tagname;
-	my $refs = git_get_references("tags");
-	open $fd, "-|", $GIT, "rev-list", "HEAD";
-	my @commits = map { chomp; $_ } <$fd>;
-	close $fd;
-	foreach my $commit (@commits) {
-		if (defined $refs->{$commit}) {
-			$tagname = $refs->{$commit}
-		}
-		if ($commit eq $hash) {
-			last;
-		}
-	}
+	# write commit message
+	if ($format eq 'html') {
+		my $refs = git_get_references();
+		my $ref = format_ref_marker($refs, $co{'id'});
+		my $formats_nav =
+			$cgi->a({-href => href(action=>"commitdiff_plain",
+			                       hash=>$hash, hash_parent=>$hash_parent)},
+			        "plain");
 
-	print $cgi->header(-type => "text/plain",
-	                   -charset => 'utf-8',
-	                   -content_disposition => "inline; filename=\"git-$hash.patch\"");
-	my %ad = parse_date($co{'author_epoch'}, $co{'author_tz'});
-	my $comment = $co{'comment'};
-	print <<TEXT;
+		git_header_html(undef, $expires);
+		git_print_page_nav('commitdiff','', $hash,$co{'tree'},$hash, $formats_nav);
+		git_print_header_div('commit', esc_html($co{'title'}) . $ref, $hash);
+		print "<div class=\"page_body\">\n";
+		print "<div class=\"log\">\n";
+		git_print_simplified_log($co{'comment'}, 1); # skip title
+		print "</div>\n"; # class="log"
+
+	} elsif ($format eq 'plain') {
+		my $refs = git_get_references("tags");
+		my @tagnames;
+		if (exists $refs->{$hash}) {
+			@tagnames = map { s|^tags/|| } $refs->{$hash};
+		}
+		my $filename = basename($project) . "-$hash.patch";
+
+		print $cgi->header(
+			-type => 'text/plain',
+			-charset => 'utf-8',
+			-expires => $expires,
+			-content_disposition => qq(inline; filename="$filename"));
+		my %ad = parse_date($co{'author_epoch'}, $co{'author_tz'});
+		print <<TEXT;
 From: $co{'author'}
 Date: $ad{'rfc2822'} ($ad{'tz_local'})
 Subject: $co{'title'}
 TEXT
-	if (defined $tagname) {
-		print "X-Git-Tag: $tagname\n";
+		foreach my $tag (@tagnames) {
+			print "X-Git-Tag: $tag\n";
+		}
+		print "X-Git-Url: " . $cgi->self_url() . "\n\n";
+		foreach my $line (@{$co{'comment'}}) {
+			print "$line\n";
+		}
+		print "---\n\n";
 	}
-	print "X-Git-Url: $my_url?p=$project;a=commitdiff;h=$hash\n" .
-	      "\n";
 
-	foreach my $line (@$comment) {;
-		print "$line\n";
-	}
-	print "---\n\n";
+	# write patch
+	if ($format eq 'html') {
+		#git_difftree_body(\@difftree, $hash, $hash_parent);
+		#print "<br/>\n";
 
-	foreach my $line (@difftree) {
-		if ($line !~ m/^:([0-7]{6}) ([0-7]{6}) ([0-9a-fA-F]{40}) ([0-9a-fA-F]{40}) (.)\t(.*)$/) {
-			next;
-		}
-		my $from_id = $3;
-		my $to_id = $4;
-		my $status = $5;
-		my $file = $6;
-		if ($status eq "A") {
-			git_diff_print(undef, "/dev/null", $to_id, "b/$file", "plain");
-		} elsif ($status eq "D") {
-			git_diff_print($from_id, "a/$file", undef, "/dev/null", "plain");
-		} elsif ($status eq "M") {
-			git_diff_print($from_id, "a/$file",  $to_id, "b/$file", "plain");
-		}
+		git_patchset_body(\@patchset, \@difftree, $hash, $hash_parent);
+
+		print "</div>\n"; # class="page_body"
+		git_footer_html();
+
+	} elsif ($format eq 'plain') {
+		local $/ = undef;
+		print <$fd>;
+		close $fd
+			or print "Reading git-diff-tree failed\n";
 	}
 }
 
+sub git_commitdiff_plain {
+	git_commitdiff('plain');
+}
+
 sub git_history {
 	if (!defined $hash_base) {
 		$hash_base = git_get_head_hash($project);
-- 
1.4.1.1

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]