Re: Importing Mozilla CVS into git

On Sun, 4 Jun 2006, Robin Rosenberg (list subscriber) wrote:
> 
> (Yet) Another problem is that many windows tools use CR LF as the line ending.
> Almost all windows editors default to CRLF and some detect existing line 
> endings. No editing with notepad anymore. Of course that is a problem 
> regardless of whether a git or cvs client is used. You'll get these big 
> everything-changed commits that alter between CRLF and LF.

The only sane approach there (if you want to be at all cross-platform) is 
to just force everybody to _commit_ in UNIX '\n'-only format. Especially 
as most Windows tools probably handle that fine on reading (just have 
trouble writing them).

And that shouldn't actually be that hard to do. The most trivial approach 
is to just have a pre-commit trigger, but let's face it, that would 
not be a good "full" solution. A better one is to make the whole
"git update-index" thing have an "automatically ignore CR/LF" mode.

Which really shouldn't be that hard. I think it's literally a matter of 
teaching "index_fd()" in sha1_file.c to recognize text-files, and convert 
CR/LF to plain LF in them. All done (except to add the flag that enables the 
detection, of course - just so that sane systems won't have the overhead 
or the "corrupt binary files" issue).
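
As a rough illustration of the "recognize text-files" step, here is a minimal
sketch of one common heuristic: treat a buffer as binary if a NUL byte shows
up near its start. The helper name looks_like_text() and the 8000-byte sniff
window are assumptions made for this example; neither comes from the patch
in this mail, which converts unconditionally.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical limit: only the first SNIFF_LEN bytes are inspected. */
#define SNIFF_LEN 8000

/*
 * Hypothetical helper (not part of the patch): returns 1 if the buffer
 * looks like text, 0 if it looks binary. The only signal used is the
 * presence of a NUL byte in the first SNIFF_LEN bytes.
 */
static int looks_like_text(const unsigned char *buf, size_t size)
{
	size_t n = size < SNIFF_LEN ? size : SNIFF_LEN;

	return memchr(buf, 0, n) == NULL;
}
```

A caller would run the CR/LF conversion only when this returns 1 and the
enabling flag is set, which is what keeps binary files from being corrupted.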

Something like this is TOTALLY UNTESTED!

(You also need to teach "diff" to ignore differences in cr/lf, and this 
patch is bad because it's unconditional, and probably doesn't work 
anyway, but hey, the idea is possibly sound. Maybe)
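
To make the "teach diff to ignore cr/lf" point concrete, here is a minimal
sketch of a line comparison that disregards the line terminator. The helper
names strip_eol() and lines_equal_ignoring_eol() are hypothetical, and this
is not how any existing diff code is structured; it only shows the idea.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the length of the line with a trailing
 * '\n' dropped, and then a trailing '\r' dropped, if present.
 */
static size_t strip_eol(const char *line, size_t len)
{
	if (len && line[len - 1] == '\n')
		len--;
	if (len && line[len - 1] == '\r')
		len--;
	return len;
}

/* Returns 1 if the two lines match once their terminators are ignored. */
static int lines_equal_ignoring_eol(const char *a, size_t alen,
				    const char *b, size_t blen)
{
	alen = strip_eol(a, alen);
	blen = strip_eol(b, blen);
	return alen == blen && memcmp(a, b, alen) == 0;
}
```

With something like this, a file checked out with CRLF endings would compare
equal, line by line, to its LF-only version in the object database.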

		Linus
---
diff --git a/sha1_file.c b/sha1_file.c
index aea0f40..6dc6a3f 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1740,9 +1740,30 @@ int index_pipe(unsigned char *sha1, int 
 	return ret;
 }
 
+static unsigned long autodetect_crlf(unsigned char *src, unsigned long size)
+{
+	unsigned long newsize = 0;
+	unsigned char *dst = src;
+	unsigned char last = 0;
+
+	while (size--) {
+		unsigned char c = *src++;
+		if (last == '\r' && c == '\n') {
+			dst[-1] = '\n';
+		} else {
+			newsize++;
+			dst++;
+			if (dst != src)
+				dst[-1] = c;
+		}
+		last = c;
+	}
+	return newsize;
+}
+
 int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, const char *type)
 {
-	unsigned long size = st->st_size;
+	unsigned long size = st->st_size, use_size;
 	void *buf;
 	int ret;
 	unsigned char hdr[50];
@@ -1755,12 +1776,15 @@ int index_fd(unsigned char *sha1, int fd
 	if (buf == MAP_FAILED)
 		return -1;
 
-	if (!type)
+	use_size = size;
+	if (!type) {
 		type = blob_type;
+		use_size = autodetect_crlf(buf, size);
+	}
 	if (write_object)
-		ret = write_sha1_file(buf, size, type, sha1);
+		ret = write_sha1_file(buf, use_size, type, sha1);
 	else {
-		write_sha1_file_prepare(buf, size, type, sha1, hdr, &hdrlen);
+		write_sha1_file_prepare(buf, use_size, type, sha1, hdr, &hdrlen);
 		ret = 0;
 	}
 	if (size)
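
Since the patch is untested, it may help to have the conversion as a
standalone function that can be compiled and checked outside of git. The
following is a sketch under that assumption: crlf_to_lf() is a hypothetical
name, and it is a restatement of the idea with separate read and write
indices, not the code from the patch.

```c
#include <stddef.h>

/*
 * Hypothetical standalone version of the in-place conversion:
 * every "\r\n" pair becomes "\n", a lone '\r' is kept as-is.
 * Returns the new length of the data in buf.
 */
static size_t crlf_to_lf(unsigned char *buf, size_t size)
{
	size_t in, out = 0;

	for (in = 0; in < size; in++) {
		/* Skip a CR only when the very next byte is LF. */
		if (buf[in] == '\r' && in + 1 < size && buf[in + 1] == '\n')
			continue;
		buf[out++] = buf[in];
	}
	return out;
}
```

Note that this writes into the buffer, so the buffer must be writable; a
read-only mapping of the file would have to be copied first.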