Re: [PATCH-v2/RFC 3/6] xutils: fix ignore-all-space on incomplete line

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano (gitster@xxxxxxxxx) wrote on Aug 23, 2009:

> Thell Fowler <git@xxxxxxxxxxxxx> writes:
> 
> > @@ -191,12 +191,14 @@ int xdl_recmatch(const char *l1, long s1, const char *l2, long s2, long flags)
> >  	int i1, i2;
> >  
> >  	if (flags & XDF_IGNORE_WHITESPACE) {
> > -		for (i1 = i2 = 0; i1 < s1 && i2 < s2; ) {
> > +		for (i1 = i2 = 0; i1 < s1 || i2 < s2; ) {
> >  			if (isspace(l1[i1]))
> > -				while (isspace(l1[i1]) && i1 < s1)
> > +				while ((isspace(l1[i1]) && i1 < s1)
> > +						|| (i1 + 1 == s1 && l1[s1] != '\n'))
> 
> This is wrong.  If you ran out l1/s1/i1 but you still have remaining
> characters in l2/s2/i2, you do not want to even look at l1[i1].
> 
> You can fudge this by sprinkling more "(i1 < s1) &&" in many places (and
> reordering how your inner while() loop checks (i1 < s1) and l1[i1]), but I
> do not think that is the right direction.
> 
> The thing is, the loop control in this function is extremely hard to read
> to begin with, and now it is "if we haven't run out both", the complexity
> seeps into the inner logic.
> 

I see what you're saying here and your absolutely right.  Good thing you 
didn't write a critique of the XDF_IGNORE_WHITESPACE_CHANGE case. ;)

> How about doing it like this patch instead?  This counterproposal replaces
> your 3 patches starting from [3/6].
[...snip...]
> The basic idea of the re-written logic is this.
> 
>  - An initial loop runs while the characters from both strings we are
>    looking at match.  We declare unmatch immediately when we find
>    something that does not match and return false from the loop.  And we
>    break out of the loop if we ran out of either side of the string.
> 
>    The way we skip spaces inside this loop varies depending on the style
>    of ignoring whitespaces.
> 
>  - After the loop, the lines can match only if the remainder consists of
>    nothing but whitespaces.  This part of the logic is shared across all
>    three styles.
> 
> The new code is more obvious and should be much easier to follow.

Because the flow is much more direct it also makes the test additions to 
t4015 obsolete as they essentially tested for line end conditions instead 
of whitespace (like they should have).
[...clip...]
> +		/*
> +		 * If we do not want -b to imply --ignore-space-at-eol
> +		 * then you would need to add this:
> +		 *
> +		 * if (!(flags & XDF_IGNORE_WHITESPACE_AT_EOL))
> +		 *	return (s1 <= i1 && s2 <= i2);
> +		 *
> +		 */
> +

While it would be nice to have -b and --ignore-space-at-eol be two 
different options that could be merged together the documentation states 
that -b ignores spaces at eol, and there are scripts that depend on this 
behavior.

IMHO  it is wrong to accept that new spaces where none existed before is 
akin to having one or more existing spaces coalesced.  I seem to recall 
reading something about 1.7 having some changes in it that wouldn't be 
backward compatible; perhaps -b and --ignore-space-at-eol could be 
distinct options for that release.

On another item:
Right now the xdl_recmatch() checks three distinct flags before having the 
opportunity to do the default behavior of a straight diff.  In 
xdl_hash_record there is an initial check for whitespace flags.

...
	if (flags & XDF_WHITESPACE_FLAGS)
		return xdl_hash_record_with_whitespace(data, top, flags);
...

Perhaps a similar setup for xdl_rematch() and a 
xdl_recmatch_with_whitespace() ?

Lastly:
Since your to counter-proposals give the same results, provide safer and 
faster processing, eliminate the additional test, as well as being easier 
to read and comprehend I propose a v3 with just those two patches.  I'll 
be glad to post it, with or without a xdl_recmatch_with_whitespace, if 
need be.  And should I, or do I need to, add something to the commit (ie: 
ack, tested, ...) ?

Thank you again for taking the time to look at this change!

-- 
Thell
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]