Re: [PATCH-v2/RFC 2/6] xutils: fix hash with whitespace on incomplete line

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thell Fowler <git@xxxxxxxxxxxxx> writes:

>   - Make xdl_hash_record_with_whitespace stop hashing before the
>     eof when ignoring space change or space at eol on an incomplete
>     line.
>
>   Resolves issue with a final trailing space being included in the
>   hash on an incomplete line by treating the eof in the same fashion
>   as a newline.

Please study the style of existing commit messages and imitate them.

> Signed-off-by: Thell Fowler <git@xxxxxxxxxxxxx>
> ---
>  xdiff/xutils.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/xdiff/xutils.c b/xdiff/xutils.c
> index 04ad468..c6512a5 100644
> --- a/xdiff/xutils.c
> +++ b/xdiff/xutils.c
> @@ -248,12 +248,12 @@ static unsigned long xdl_hash_record_with_whitespace(char const **data,
>  			if (flags & XDF_IGNORE_WHITESPACE)
>  				; /* already handled */
>  			else if (flags & XDF_IGNORE_WHITESPACE_CHANGE
> -					&& ptr[1] != '\n') {
> +					&& ptr[1] != '\n' && ptr + 1 < top) {
>  				ha += (ha << 5);
>  				ha ^= (unsigned long) ' ';
>  			}
>  			else if (flags & XDF_IGNORE_WHITESPACE_AT_EOL
> -					&& ptr[1] != '\n') {
> +					&& ptr[1] != '\n' && ptr + 1 < top) {
>  				while (ptr2 != ptr + 1) {
>  					ha += (ha << 5);
>  					ha ^= (unsigned long) *ptr2;

Thanks.

The issue you identified and tried to fix is a worthy one.  But before the
pre-context of this hunk, I notice these lines:

		if (isspace(*ptr)) {
			const char *ptr2 = ptr;
			while (ptr + 1 < top && isspace(ptr[1])
					&& ptr[1] != '\n')
				ptr++;

If you have trailing whitespaces on an incomplete line, ptr initially
points at the first such whitespace, ptr2 points at the same location, and
then the while() loop advances ptr to point at the last byte on the line,
which in turn will be the last byte of the file.  And the codepath with
your updates still try to access ptr[1] that is beyond that last byte.

I would write it like this patch instead.

The intent is the same as your patch, but it avoids accessing ptr[1] when
that is beyond the end of the buffer, and the logic is easier to follow as
well.

-- >8 --
Subject: xutils: fix hashing an incomplete line with whitespaces at the end

Upon seeing a whitespace, xdl_hash_record_with_whitespace() first skipped
the run of whitespaces (excluding LF) that begins there, ensuring that the
pointer points the last whitespace character in the run, and assumed that
the next character must be LF at the end of the line.  This does not work
when hashing an incomplete line, that lacks the LF at the end.

Introduce "at_eol" variable that is true when either we are at the end of
line (looking at LF) or at the end of an incomplete line, and use that
instead throughout the code.

Noticed by Thell Fowler.

Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx>
---
 xdiff/xutils.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 04ad468..9411fa9 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -242,18 +242,20 @@ static unsigned long xdl_hash_record_with_whitespace(char const **data,
 	for (; ptr < top && *ptr != '\n'; ptr++) {
 		if (isspace(*ptr)) {
 			const char *ptr2 = ptr;
+			int at_eol;
 			while (ptr + 1 < top && isspace(ptr[1])
 					&& ptr[1] != '\n')
 				ptr++;
+			at_eol = (top <= ptr + 1 || ptr[1] == '\n');
 			if (flags & XDF_IGNORE_WHITESPACE)
 				; /* already handled */
 			else if (flags & XDF_IGNORE_WHITESPACE_CHANGE
-					&& ptr[1] != '\n') {
+				 && !at_eol) {
 				ha += (ha << 5);
 				ha ^= (unsigned long) ' ';
 			}
 			else if (flags & XDF_IGNORE_WHITESPACE_AT_EOL
-					&& ptr[1] != '\n') {
+				 && !at_eol) {
 				while (ptr2 != ptr + 1) {
 					ha += (ha << 5);
 					ha ^= (unsigned long) *ptr2;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]