Re: [PATCH-v2/RFC 3/6] xutils: fix ignore-all-space on incomplete line

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thell Fowler <git@xxxxxxxxxxxxx> writes:

> @@ -191,12 +191,14 @@ int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags)
>  	int i1, i2;
>  
>  	if (flags & XDF_IGNORE_WHITESPACE) {
> -		for (i1 = i2 = 0; i1 < s1 && i2 < s2; ) {
> +		for (i1 = i2 = 0; i1 < s1 || i2 < s2; ) {
>  			if (isspace(l1[i1]))
> -				while (isspace(l1[i1]) && i1 < s1)
> +				while ((isspace(l1[i1]) && i1 < s1)
> +						|| (i1 + 1 == s1 && l1[s1] != '\n'))

This is wrong.  If you ran out l1/s1/i1 but you still have remaining
characters in l2/s2/i2, you do not want to even look at l1[i1].

You can fudge this by sprinkling more "(i1 < s1) &&" in many places (and
reordering how your inner while() loop checks (i1 < s1) and l1[i1]), but I
do not think that is the right direction.

The thing is, the loop control in this function is extremely hard to read
to begin with, and now it is "if we haven't run out both", the complexity
seeps into the inner logic.

How about doing it like this patch instead?  This counterproposal replaces
your 3 patches starting from [3/6].

-- >8 --
Subject: xutils: Fix xdl_recmatch() on incomplete lines

Thell Fowler noticed that various "ignore whitespace" options to
git diff does not work well with whitespace glitches on an incomplete
line.

The loop control of this function incorrectly handled incomplete lines,
and it was extremely difficult to follow.  This restructures the loops for
three variants of "ignore whitespace" logic.

The basic idea of the re-written logic is this.

 - An initial loop runs while the characters from both strings we are
   looking at match.  We declare unmatch immediately when we find
   something that does not match and return false from the loop.  And we
   break out of the loop if we ran out of either side of the string.

   The way we skip spaces inside this loop varies depending on the style
   of ignoring whitespaces.

 - After the loop, the lines can match only if the remainder consists of
   nothing but whitespaces.  This part of the logic is shared across all
   three styles.

The new code is more obvious and should be much easier to follow.

Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx>

---
 xdiff/xutils.c |  111 +++++++++++++++++++++++++++++++++++++++-----------------
 1 files changed, 77 insertions(+), 34 deletions(-)

diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 9411fa9..dd8b7e7 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -186,50 +186,93 @@ long xdl_guess_lines(mmfile_t *mf) {
 	return nl + 1;
 }
 
+static int remainder_all_ws(const char *l1, const char *l2,
+			    int i1, int i2, long s1, long s2)
+{
+	if (i1 < s1) {
+		while (i1 < s1 && isspace(l1[i1]))
+			i1++;
+		return (s1 == i1);
+	}
+	if (i2 < s2) {
+		while (i2 < s2 && isspace(l2[i2]))
+			i2++;
+		return (s2 == i2);
+	}
+	return 1;
+}
+
 int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags)
 {
-	int i1, i2;
+	int i1 = 0, i2 = 0;
 
 	if (flags & XDF_IGNORE_WHITESPACE) {
-		for (i1 = i2 = 0; i1 < s1 && i2 < s2; ) {
-			if (isspace(l1[i1]))
-				while (isspace(l1[i1]) && i1 < s1)
-					i1++;
-			if (isspace(l2[i2]))
-				while (isspace(l2[i2]) && i2 < s2)
-					i2++;
-			if (i1 < s1 && i2 < s2 && l1[i1++] != l2[i2++])
-				return 0;
+		while (1) {
+			while (i1 < s1 && isspace(l1[i1]))
+				i1++;
+			while (i2 < s2 && isspace(l2[i2]))
+				i2++;
+			if (i1 < s1 && i2 < s2) {
+				if (l1[i1++] != l2[i2++])
+					return 0;
+				continue;
+			}
+			break;
 		}
-		return (i1 >= s1 && i2 >= s2);
+
+		/*
+		 * we ran out one side; the remaining side must be all
+		 * whitespace to match.
+		 */
+		return remainder_all_ws(l1, l2, i1, i2, s1, s2);
 	} else if (flags & XDF_IGNORE_WHITESPACE_CHANGE) {
-		for (i1 = i2 = 0; i1 < s1 && i2 < s2; ) {
-			if (isspace(l1[i1])) {
-				if (!isspace(l2[i2]))
+		while (1) {
+			if (i1 < s1 && i2 < s2) {
+				/* Skip matching spaces */
+				if (isspace(l1[i1]) && isspace(l2[i2])) {
+					while (i1 < s1 && isspace(l1[i1]))
+						i1++;
+					while (i2 < s2 && isspace(l2[i2]))
+						i2++;
+				}
+			}
+			if (i1 < s1 && i2 < s2) {
+				/*
+				 * We still have both sides; do they match?
+				 */
+				if (l1[i1++] != l2[i2++])
 					return 0;
-				while (isspace(l1[i1]) && i1 < s1)
-					i1++;
-				while (isspace(l2[i2]) && i2 < s2)
-					i2++;
-			} else if (l1[i1++] != l2[i2++])
-				return 0;
+				continue;
+			}
+			break;
 		}
-		return (i1 >= s1 && i2 >= s2);
+
+		/*
+		 * If we do not want -b to imply --ignore-space-at-eol
+		 * then you would need to add this:
+		 *
+		 * if (!(flags & XDF_IGNORE_WHITESPACE_AT_EOL))
+		 *	return (s1 <= i1 && s2 <= i2);
+		 *
+		 */
+
+		/*
+		 * we ran out one side; the remaining side must be all
+		 * whitespace to match.
+		 */
+		return remainder_all_ws(l1, l2, i1, i2, s1, s2);
+
 	} else if (flags & XDF_IGNORE_WHITESPACE_AT_EOL) {
-		for (i1 = i2 = 0; i1 < s1 && i2 < s2; ) {
-			if (l1[i1] != l2[i2]) {
-				while (i1 < s1 && isspace(l1[i1]))
-					i1++;
-				while (i2 < s2 && isspace(l2[i2]))
-					i2++;
-				if (i1 < s1 || i2 < s2)
-					return 0;
-				return 1;
-			}
-			i1++;
-			i2++;
+		while (1) {
+			if (i1 < s1 && i2 < s2 && l1[i1++] == l2[i2++])
+				continue;
+			break;
 		}
-		return i1 >= s1 && i2 >= s2;
+		/*
+		 * we ran out one side; the remaining side must be all
+		 * whitespace to match.
+		 */
+		return remainder_all_ws(l1, l2, i1, i2, s1, s2);
 	} else
 		return s1 == s2 && !memcmp(l1, l2, s1);
 }
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]