Re: [PATCH 1/2] git apply: option to ignore whitespace differences

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> diff --git a/builtin-apply.c b/builtin-apply.c
> index dc0ff5e..86860d6 100644
> --- a/builtin-apply.c
> +++ b/builtin-apply.c
> @@ -39,6 +39,7 @@ static int diffstat;
>  static int numstat;
>  static int summary;
>  static int check;
> +static int ignore_whitespace = 0;

s/ = 0//;

> +/*
> + * Compare two memory areas ignoring whitespace differences
> + */
> +static int memcmp_ignore_whitespace(const char *s1, const char *s2, size_t n)
> +{

Don't you think this function signature is bogus?

You are going to use this function to compare a line that is in the target
buffer (i.e. the contents of the current blob) with a line read from the
patch context when the user says they might match if you ignore whitespace
differences.

How on earth can you do that with only a single length parameter "n"?
Length of which side are you talking about?

Remember that git-apply is designed to be able to handle a patch that has
NUL in it (generated with "diff --text"), so strlen() is not an acceptable
answer.

> +static int memcmp_switch(const char *s1, const char *s2, size_t n)
> +{
> +	if (ignore_whitespace)
> +		return memcmp_ignore_whitespace(s1, s2, n);
> +	else
> +		return memcmp(s1, s2, n);
> +}
> +

This function signature shares the same bogosity with the previous one.

In addition, it's name and semantics is bogus.

The original implementation had encapsulated the notion that it is
comparing two lines entirely in match_fragment().  Its use of memcmp() was
an implementation detail of a half of the open-coded logic to compare two
lines (the other half being the comparison of lengths), and direct use of
memcmp() made perfect sense.  It checked the length matched.  Now it
wanted to make sure they matched byte-for-byte.

If you are separating that "compare two lines" logic into a helper
function, I would expect its name actually reflect its purpose, whose
behaviour may change depending on --ignore-whitespace.  The traditional
codepath would say "do they have the same length and match byte-for-byte?"
while the new loosened codepath would say "we do not care about the
whitespaces; do they match if we disregard ws differences?"

I also suspect that you might be able to optimize the existing "allow
whitespace-fixed match" a bit by "fix"ing the target buffer only once,
instead of doing so line-by-line for every offset that find_pos() checks
by calling match_fragment().  It is an independent issue, but it appears
to me that the change this patch wants to do to match_fragment() might
become cleaner if we did that conversion first, as match_fragment() itself
won't have to have two cases (early return for exact match, and match with
whitespace-fixed target).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]