[PATCH/RFC v2] ll-merge: Normalize files before merging

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Currently, merging across changes in line ending normalization is
painful since all lines containing CRLF will conflict.

Fix ll-merge so that the "base", "theirs" and "ours" files are passed
through convert_to_worktree() and convert_to_git() before a three-way
merge.

This prevents differences that can be normalized away from blocking an
automatic merge, and makes real conflicts show up instead of being lost
in the noise.

Signed-off-by: Eyvind Bernhardsen <eyvind.bernhardsen@xxxxxxxxx>
---
Second stab at having ll-merge automatically handle merges across text
conversion boundaries.  The previous implementation ran convert_to_git()
on already normalized data, which is potentially dangerous.  This
version runs convert_to_worktree() first.

As promised, I did some benchmarks this time.  My repository has 55617
files, all normalized with "* text=auto".  The branch to be merged
changes 16157 of them and is based on an old, pre-normalization commit.

An uncompleted merge doesn't make for a fair performance comparison, so
I ended up rebasing the topic branch on top of the first normalized
commit on master to compare performance.  I ran each merge 10 times and
averaged the user time for each run.

With the simplified merge there was no measurable difference between
running with or without normalization.  Adding core.eol=crlf when
normalizing cost about 5%.

A remaining problem is that delete/modify conflicts where the "modify"
is just normalization have to be resolved manually.  This could be fixed
by normalizing files when there is a d/m conflict and then comparing the
normalized sha1s.

- Eyvind

 ll-merge.c                 |   17 +++++++++++++
 t/t6038-merge-text-auto.sh |   54 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 71 insertions(+), 0 deletions(-)
 create mode 100755 t/t6038-merge-text-auto.sh

diff --git a/ll-merge.c b/ll-merge.c
index f9b3d85..264337e 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -321,6 +321,20 @@ static int git_path_check_merge(const char *path, struct git_attr_check check[2]
 	return git_checkattr(path, 2, check);
 }
 
+static void normalize_file(mmfile_t *mm, const char *path) {
+	struct strbuf strbuf = STRBUF_INIT;
+	if (convert_to_working_tree(path, mm->ptr, mm->size, &strbuf)) {
+		free(mm->ptr);
+		mm->size = strbuf.len;
+		mm->ptr = strbuf_detach(&strbuf, NULL);
+	}
+	if (convert_to_git(path, mm->ptr, mm->size, &strbuf, 0)) {
+		free(mm->ptr);
+		mm->size = strbuf.len;
+		mm->ptr = strbuf_detach(&strbuf, NULL);
+	}
+}
+
 int ll_merge(mmbuffer_t *result_buf,
 	     const char *path,
 	     mmfile_t *ancestor, const char *ancestor_label,
@@ -334,6 +348,9 @@ int ll_merge(mmbuffer_t *result_buf,
 	const struct ll_merge_driver *driver;
 	int virtual_ancestor = flag & 01;
 
+	normalize_file(ancestor, path);
+	normalize_file(ours, path);
+	normalize_file(theirs, path);
 	if (!git_path_check_merge(path, check)) {
 		ll_driver_name = check[0].value;
 		if (check[1].value) {
diff --git a/t/t6038-merge-text-auto.sh b/t/t6038-merge-text-auto.sh
new file mode 100755
index 0000000..6af2c41
--- /dev/null
+++ b/t/t6038-merge-text-auto.sh
@@ -0,0 +1,54 @@
+#!/bin/sh
+
+test_description='CRLF merge conflict across text=auto change'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	git config core.autocrlf false &&
+	echo first line | append_cr >file &&
+	git add file &&
+	git commit -m "Initial" &&
+	git tag initial &&
+	git branch side &&
+	echo "* text=auto" >.gitattributes &&
+	git add .gitattributes &&
+	echo same line | append_cr >>file &&
+	git add file &&
+	git commit -m "add line from a" &&
+	git tag a &&
+	git rm .gitattributes &&
+	rm file &&
+	git checkout file &&
+	git commit -m "remove .gitattributes" &&
+	git tag c &&
+	git checkout side &&
+	echo same line | append_cr >>file &&
+	git commit -m "add line from b" file &&
+	git tag b &&
+	git checkout master
+'
+
+test_expect_success 'Check merging after setting text=auto' '
+	git reset --hard a &&
+	git merge b &&
+	cat file | remove_cr >file.temp &&
+	test_cmp file file.temp
+'
+
+test_expect_success 'Check merging addition of text=auto' '
+	git reset --hard b &&
+	git merge a &&
+	cat file | remove_cr >file.temp &&
+	test_cmp file file.temp
+'
+
+# Not sure if this deserves to be fixed
+test_expect_failure 'Check merging removal of text=auto' '
+	git reset --hard b &&
+	git merge c &&
+	cat file | remove_cr >file.temp &&
+	test_cmp file file.temp
+'
+
+test_done
-- 
1.7.1.5.g0ed10.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]