[PATCH v2 2/4] utf8-bom: introduce skip_utf8_bom() helper

With the recent change to ignore the UTF-8 BOM at the beginning of
.gitignore files, we now have two codepaths that skip it (the other
one reads the configuration files).

Introduce a utf8_bom[] constant string and a skip_utf8_bom() helper,
and teach the .gitignore code to use them.

Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx>
---
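
 A minimal standalone sketch of how a caller might use the helper,
 for illustration only. utf8_bom[] and skip_utf8_bom() are copied
 from the utf8.c hunk below; payload_after_bom() is a hypothetical
 wrapper that is not part of this patch and only mirrors the size
 adjustment done in add_excludes_from_file_to_list().

```c
#include <stddef.h>
#include <string.h>

/* Copied from the patch: the three bytes EF BB BF, written in octal. */
const char utf8_bom[] = "\357\273\277";

/* Copied from the patch: advance *text past a leading BOM, if any. */
int skip_utf8_bom(char **text, size_t len)
{
	if (len < strlen(utf8_bom) ||
	    memcmp(*text, utf8_bom, strlen(utf8_bom)))
		return 0;
	*text += strlen(utf8_bom);
	return 1;
}

/*
 * Hypothetical caller, not part of the patch: point *start at the
 * first byte after an optional BOM and return how many bytes remain.
 * The original buf pointer is left alone so the caller can still
 * free() it, which is why dir.c keeps el->filebuf separately.
 */
size_t payload_after_bom(char *buf, size_t size, char **start)
{
	char *p = buf;

	if (skip_utf8_bom(&p, size))
		size -= p - buf;	/* shrink by the bytes skipped */
	*start = p;
	return size;
}
```

 The helper only moves the pointer; it does not touch the length, so
 each caller has to shrink its own byte count as shown above.
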
 dir.c  |  9 ++++-----
 utf8.c | 11 +++++++++++
 utf8.h |  3 +++
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index b5bb389..4c4bf91 100644
--- a/dir.c
+++ b/dir.c
@@ -12,6 +12,7 @@
 #include "refs.h"
 #include "wildmatch.h"
 #include "pathspec.h"
+#include "utf8.h"
 
 struct path_simplify {
 	int len;
@@ -538,7 +539,6 @@ int add_excludes_from_file_to_list(const char *fname,
 	struct stat st;
 	int fd, i, lineno = 1;
 	size_t size = 0;
-	static const unsigned char *utf8_bom = (unsigned char *) "\xef\xbb\xbf";
 	char *buf, *entry;
 
 	fd = open(fname, O_RDONLY);
@@ -576,10 +576,9 @@ int add_excludes_from_file_to_list(const char *fname,
 
 	el->filebuf = buf;
 
-	if (size >= 3 && !memcmp(buf, utf8_bom, 3)) {
-		buf += 3;
-		size -= 3;
-	}
+	if (skip_utf8_bom(&buf, size))
+		size -= buf - el->filebuf;
+
 	entry = buf;
 
 	for (i = 0; i < size; i++) {
diff --git a/utf8.c b/utf8.c
index 520fbb4..28e6d76 100644
--- a/utf8.c
+++ b/utf8.c
@@ -633,3 +633,14 @@ int is_hfs_dotgit(const char *path)
 
 	return 1;
 }
+
+const char utf8_bom[] = "\357\273\277";
+
+int skip_utf8_bom(char **text, size_t len)
+{
+	if (len < strlen(utf8_bom) ||
+	    memcmp(*text, utf8_bom, strlen(utf8_bom)))
+		return 0;
+	*text += strlen(utf8_bom);
+	return 1;
+}
diff --git a/utf8.h b/utf8.h
index e4d9183..e7b2aa4 100644
--- a/utf8.h
+++ b/utf8.h
@@ -13,6 +13,9 @@ int same_encoding(const char *, const char *);
 __attribute__((format (printf, 2, 3)))
 int utf8_fprintf(FILE *, const char *, ...);
 
+extern const char utf8_bom[];
+extern int skip_utf8_bom(char **, size_t);
+
 void strbuf_add_wrapped_text(struct strbuf *buf,
 		const char *text, int indent, int indent2, int width);
 void strbuf_add_wrapped_bytes(struct strbuf *buf, const char *data, int len,
-- 
2.4.0-rc2-171-g98ddf7f
