Re: [PATCH] diff.c: increment buffer pointer in all code path

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 19, 2017 at 02:30:08PM +0900, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > it does. It just adjusts our "end pointer" to point to the last valid
> > character in the string (rather than one past), which seems to be the
> > convention that those loops (and next_byte) expect.
> 
> Yeah I am not sure if I like this comparison at the beginning of the
> function:
> 
>         static int next_byte(const char **cp, const char **endp,
>                              const struct diff_options *diffopt)
>         {
>                 int retval;
> 
>                 if (*cp > *endp)
>                         return -1;
> 
> but it says endp _is_ part of valid input, contrary to my intuition.

Actually, I think even this function is confused about its convention.
In the line you quote, we clearly treat *endp as part of the input. But
later we do:

  while (*cp < *endp && isspace(**cp))
	  (*cp)++;

meaning that we'd fail to soak up whitespace at *endp. That wouldn't be
so bad if not for the other bug which fails to eat whitespace at endp in
the first place. :)

So I think the right fix is this:

diff --git a/diff.c b/diff.c
index 6fd288420b..09081a207c 100644
--- a/diff.c
+++ b/diff.c
@@ -712,7 +712,7 @@ static int next_byte(const char **cp, const char **endp,
 {
 	int retval;
 
-	if (*cp > *endp)
+	if (*cp >= *endp)
 		return -1;
 
 	if (isspace(**cp)) {
@@ -729,7 +729,12 @@ static int next_byte(const char **cp, const char **endp,
 		if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE)) {
 			while (*cp < *endp && isspace(**cp))
 				(*cp)++;
-			/* return the first non-ws character via the usual below */
+			/*
+			 * return the first non-ws character via the usual
+			 * below, unless we ate all of the bytes
+			 */
+			if (*cp >= *endp)
+				return -1;
 		}
 	}
 
@@ -750,9 +755,9 @@ static int moved_entry_cmp(const struct diff_options *diffopt,
 		return a->es->len != b->es->len  || memcmp(ap, bp, a->es->len);
 
 	if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE_AT_EOL)) {
-		while (ae > ap && isspace(*ae))
+		while (ae > ap && isspace(ae[-1]))
 			ae--;
-		while (be > bp && isspace(*be))
+		while (be > bp && isspace(be[-1]))
 			be--;
 	}
 
@@ -775,7 +780,7 @@ static unsigned get_string_hash(struct emitted_diff_symbol *es, struct diff_opti
 		int c;
 
 		strbuf_reset(&sb);
-		while (ae > ap && isspace(*ae))
+		while (ae > ap && isspace(ae[-1]))
 			ae--;
 		while ((c = next_byte(&ap, &ae, o)) > 0)
 			strbuf_addch(&sb, c);

It's late here, so I'll wait for comments from Stefan and then try to
wrap it up with a commit message and test tomorrow.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux