git am --rebasing clobbers commit encoding

Hi,

commit 5e835cac "rebase: do not munge commit log message", aiming to fix
rebasing of commits with multiple lines in the first paragraph of the
commit message, has broken rebasing of commits which are not encoded in
i18n.commitEncoding.

Here's the patch from 5e835cac:

diff --git a/git-am.sh b/git-am.sh
index 245e1db..5a7695e 100755
--- a/git-am.sh
+++ b/git-am.sh
@@ -327,11 +327,20 @@ do
 			echo "Patch is empty.  Was it split wrong?"
 			stop_here $this
 		}
-		SUBJECT="$(sed -n '/^Subject/ s/Subject: //p' "$dotest/info")"
-		case "$keep_subject" in -k)  SUBJECT="[PATCH] $SUBJECT" ;; esac
-
-		(echo "$SUBJECT" ; echo ; cat "$dotest/msg") |
-			git stripspace > "$dotest/msg-clean"
+		if test -f "$dotest/rebasing" &&
+			commit=$(sed -e 's/^From \([0-9a-f]*\) .*/\1/' \
+				-e q "$dotest/$msgnum") &&
+			test "$(git cat-file -t "$commit")" = commit
+		then
+			git cat-file commit "$commit" |
+			sed -e '1,/^$/d' >"$dotest/msg-clean"
+		else
+			SUBJECT="$(sed -n '/^Subject/ s/Subject: //p' "$dotest/info")"
+			case "$keep_subject" in -k)  SUBJECT="[PATCH] $SUBJECT" ;; esac
+
+			(echo "$SUBJECT" ; echo ; cat "$dotest/msg") |
+				git stripspace > "$dotest/msg-clean"
+		fi
 		;;
 	esac
 
The problem is that "git cat-file commit .. | sed -e '1,/^$/d'"
discards the commit's "encoding" header.  The commit message is taken
as-is, but is later committed as if it were encoded in the current
i18n.commitEncoding.
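
To illustrate the effect, here is a minimal sketch that runs the same sed
expression over a made-up raw commit object. The tree id, the author, the
message text and the helper name raw_commit are all invented for the
example; only the sed expression '1,/^$/d' comes from the patch above.

```shell
# Hypothetical raw commit object, in the layout printed by
# "git cat-file commit <sha1>": header lines, a blank line, the message.
raw_commit () {
	cat <<'EOF'
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
author A U Thor <author@example.com> 1200000000 +0100
committer A U Thor <author@example.com> 1200000000 +0100
encoding ISO-8859-1

Subject line

Body text.
EOF
}

# What the rebasing branch keeps: everything after the first blank line.
message=$(raw_commit | sed -e '1,/^$/d')

# What it throws away: the header block.  Read the "encoding" line from
# it, stopping at the first blank line so the message body is never scanned.
encoding=$(raw_commit | sed -n -e '/^$/q' -e 's/^encoding //p')

printf 'kept message:\n%s\n' "$message"
printf 'discarded encoding: %s\n' "$encoding"
```

The message bytes survive untouched, but nothing downstream knows any
more that they were ISO-8859-1.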

I'm working on several machines, some of which use the legacy latin1
encoding; if I push a latin1-encoded commit to a machine using utf-8,
and rebase it there, the commit messages are corrupted.

This is particularly unfortunate since with well-formed commit messages
(a single line in the first paragraph), "$dotest/msg" is perfectly fine,
and the above patch is not needed anyway.

Unfortunately, I don't see an easy way to fix this.  It seems to me that
recoding commit messages isn't really supported: commits are _always_
done with the configured i18n.commitEncoding, and the only place where
recoding takes place is processing of patches sent by email, where
git-mailinfo recodes the commit message.  And commit 5e835cac has
bypassed git-mailinfo in git-am for rebasing.

I guess we cannot call iconv(1) after git cat-file because that's not
portable, and there is no plumbing command that can be used from a shell
script to convert the encoding of a commit; the functionality is in
utf8.c:reencode_string(), so only C code can use it.
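
For illustration only, this is roughly what the iconv(1) route would look
like on a system that has it. It is a sketch, not a proposed fix: the
function name recode_commit_msg and its two arguments are made up, it
assumes the raw commit object has first been saved to a file, and it
assumes that a commit without an "encoding" header is in UTF-8.

```shell
# Hypothetical helper, not part of git-am.sh.
# $1: file holding the output of "git cat-file commit <sha1>"
# $2: target encoding, e.g. the value of i18n.commitEncoding
recode_commit_msg () {
	raw=$1 to=$2
	# Read the "encoding" header; stop at the blank line that ends the
	# header block.  LC_ALL=C keeps sed from interpreting non-UTF-8 bytes.
	from=$(LC_ALL=C sed -n -e '/^$/q' -e 's/^encoding //p' "$raw")
	# Assumption: no header means the message is UTF-8.
	test -n "$from" || from=UTF-8
	# Strip the header as the patch does, then recode the message.
	LC_ALL=C sed -e '1,/^$/d' "$raw" | iconv -f "$from" -t "$to"
}
```

It would be called as recode_commit_msg "$dotest/commit-raw" UTF-8, where
"$dotest/commit-raw" is again a made-up file name. The portability
concern stands: this only works where iconv(1) exists and knows both
encoding names.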

Nikolaus