Re: fatal: cannot convert from utf8 to UTF-8

Cristian Tibirna <ctibirna@xxxxxxxxxxxxxxx> writes:

> This error:
>
> fatal: cannot convert from utf8 to UTF-8
> ...
> This is in part our fault: during the standardisation of our git environment, 
> we (re)enforced UTF-8 encodings by setting "i18n.commitEncoding" and 
> "i18n.logOutputEncoding" to "utf8".
> ...
> I know "utf8" is not an accepted denomination ("UTF-8" or "utf-8" should be 
> used, according to IANA standards),...

Perhaps like this.

-- >8 --
Subject: [PATCH] reencode_string(): introduce and use same_encoding()

Callers of reencode_string(), which re-encodes a string from one
encoding to another, each used an ad-hoc way to bypass the case where
the input and the output encodings are the same.  Some did strcmp(),
some did strcasecmp(), and yet others, when converting to UTF-8, used
is_encoding_utf8().

Introduce a same_encoding() helper function so that these callers
use the same logic.  Notably, is_encoding_utf8() has a work-around
for a common misconfiguration that uses "utf8" to name the UTF-8
encoding; that spelling does not match "UTF-8", so strcasecmp() alone
would not consider the two the same.  Make use of it in this helper
function.

Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx>
---
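For illustration only, here is a standalone sketch of the comparison
the helper performs.  The is_encoding_utf8() below is a simplified
stand-in written for this example (the real one lives in utf8.c and
may differ), and needs_reencode() is a hypothetical caller-side
wrapper that is not part of the patch.

```c
#include <strings.h>	/* strcasecmp() (POSIX) */

/*
 * Simplified stand-in for git's is_encoding_utf8(); assumed here to
 * accept both the proper "UTF-8" name and the common "utf8"
 * misspelling, ignoring case.
 */
static int is_encoding_utf8(const char *name)
{
	return !strcasecmp(name, "utf-8") || !strcasecmp(name, "utf8");
}

/* Same logic as the helper this patch adds to utf8.c. */
static int same_encoding(const char *src, const char *dst)
{
	if (is_encoding_utf8(src) && is_encoding_utf8(dst))
		return 1;
	return !strcasecmp(src, dst);
}

/*
 * Hypothetical wrapper: a caller that used to test "strcmp(a, b)"
 * to mean "the encodings differ" has to negate same_encoding().
 */
static int needs_reencode(const char *from, const char *to)
{
	return !same_encoding(from, to);
}
```

With these definitions, same_encoding("utf8", "UTF-8") is true even
though strcasecmp() on the same pair is not zero, so
needs_reencode("utf8", "UTF-8") is false and no iconv conversion
would be attempted for that pair.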

 builtin/mailinfo.c | 2 +-
 notes.c            | 2 +-
 pretty.c           | 2 +-
 sequencer.c        | 2 +-
 utf8.c             | 7 +++++++
 utf8.h             | 1 +
 6 files changed, 12 insertions(+), 4 deletions(-)

diff --git c/builtin/mailinfo.c w/builtin/mailinfo.c
index da23140..e4e39d6 100644
--- c/builtin/mailinfo.c
+++ w/builtin/mailinfo.c
@@ -483,7 +483,7 @@ static void convert_to_utf8(struct strbuf *line, const char *charset)
 
 	if (!charset || !*charset)
 		return;
-	if (!strcasecmp(metainfo_charset, charset))
+	if (same_encoding(metainfo_charset, charset))
 		return;
 	out = reencode_string(line->buf, metainfo_charset, charset);
 	if (!out)
diff --git c/notes.c w/notes.c
index bc454e1..ee8f01f 100644
--- c/notes.c
+++ w/notes.c
@@ -1231,7 +1231,7 @@ static void format_note(struct notes_tree *t, const unsigned char *object_sha1,
 	}
 
 	if (output_encoding && *output_encoding &&
-			strcmp(utf8, output_encoding)) {
+	    !is_encoding_utf8(output_encoding)) {
 		char *reencoded = reencode_string(msg, output_encoding, utf8);
 		if (reencoded) {
 			free(msg);
diff --git c/pretty.c w/pretty.c
index 8b1ea9f..e87fe9f 100644
--- c/pretty.c
+++ w/pretty.c
@@ -504,7 +504,7 @@ char *logmsg_reencode(const struct commit *commit,
 		return NULL;
 	encoding = get_header(commit, "encoding");
 	use_encoding = encoding ? encoding : utf8;
-	if (!strcmp(use_encoding, output_encoding))
+	if (same_encoding(use_encoding, output_encoding))
 		if (encoding) /* we'll strip encoding header later */
 			out = xstrdup(commit->buffer);
 		else
diff --git c/sequencer.c w/sequencer.c
index e3723d2..73c396b 100644
--- c/sequencer.c
+++ w/sequencer.c
@@ -60,7 +60,7 @@ static int get_message(struct commit *commit, struct commit_message *out)
 
 	out->reencoded_message = NULL;
 	out->message = commit->buffer;
-	if (strcmp(encoding, git_commit_encoding))
+	if (!same_encoding(encoding, git_commit_encoding))
 		out->reencoded_message = reencode_string(commit->buffer,
 					git_commit_encoding, encoding);
 	if (out->reencoded_message)
diff --git c/utf8.c w/utf8.c
index a544f15..6a52834 100644
--- c/utf8.c
+++ w/utf8.c
@@ -423,6 +423,13 @@ int is_encoding_utf8(const char *name)
 	return 0;
 }
 
+int same_encoding(const char *src, const char *dst)
+{
+	if (is_encoding_utf8(src) && is_encoding_utf8(dst))
+		return 1;
+	return !strcasecmp(src, dst);
+}
+
 /*
  * Given a buffer and its encoding, return it re-encoded
  * with iconv.  If the conversion fails, returns NULL.
diff --git c/utf8.h w/utf8.h
index 3c0ae76..93ef600 100644
--- c/utf8.h
+++ w/utf8.h
@@ -7,6 +7,7 @@ int utf8_width(const char **start, size_t *remainder_p);
 int utf8_strwidth(const char *string);
 int is_utf8(const char *text);
 int is_encoding_utf8(const char *name);
+int same_encoding(const char *, const char *);
 
 int strbuf_add_wrapped_text(struct strbuf *buf,
 		const char *text, int indent, int indent2, int width);