Re: [PATCH] contrib/diff-highlight: multibyte characters diff

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks for reviewing.
as you wrote, diff content may not be utf8 at all. and we don't know that the user's terminal watns is utf8.
I think your trying utf8 decode and fall back approach is better than my patch, and do work well.

is using "$@" for catching error like the patch below?
According to perldoc Encode.pm, encode/decode with "FB_CROAK" may destroy original string. We should probabry use "LEAVE_SRC" on decode_utf8's second argument.

---
 contrib/diff-highlight/diff-highlight | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight
index c4404d4..0743851 100755
--- a/contrib/diff-highlight/diff-highlight
+++ b/contrib/diff-highlight/diff-highlight
@@ -2,6 +2,7 @@
 
 use warnings FATAL => 'all';
 use strict;
+use Encode qw(decode_utf8 encode_utf8);
 
 # Highlight by reversing foreground and background. You could do
 # other things like bold or underline if you prefer.
@@ -73,13 +74,23 @@ sub show_hunk {
 
 	my @queue;
 	for (my $i = 0; $i < @$a; $i++) {
-		my ($rm, $add) = highlight_pair($a->[$i], $b->[$i]);
-		print $rm;
-		push @queue, $add;
+		my ($a_dec, $encode_rm) = decode($a->[$i]);
+		my ($b_dec, $encode_add) = decode($b->[$i]);
+		my ($rm, $add) = highlight_pair($a_dec, $b_dec);
+		print $encode_rm->($rm);
+		push @queue, $encode_add->($add);
 	}
 	print @queue;
 }
 
+sub decode {
+	my $orig = shift;
+	my $decoded = eval { decode_utf8($orig, Encode::FB_CROAK | Encode::LEAVE_SRC) };
+	return $@ ?
+	       ($orig, sub { shift }) :
+	       ($decoded, sub { encode_utf8(shift) });
+}
+
 sub highlight_pair {
 	my @a = split_line(shift);
 	my @b = split_line(shift);
-- 
1.8.5.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]