Re: [RFC PATCH 1/2] send-email: fix garbage removal after address

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Matthieu Moy <git@xxxxxxxxxxxxxxx> writes:

> This is a followup over 9d33439 (send-email: only allow one address
> per body tag, 2017-02-20). The first iteration did allow writting
>
>   Cc: <foo@xxxxxxxxxxx> # garbage
>
> but did so by matching the regex ([^>]*>?), i.e. stop after the first
> instance of '>'. However, it did not properly deal with
>
>   Cc: foo@xxxxxxxxxxx # garbage
>
> Fix this using a new function strip_garbage_one_address, which does
> essentially what the old ([^>]*>?) was doing, but dealing with more
> corner-cases. Since we've allowed
>
>   Cc: "Foo # Bar" <foobar@xxxxxxxxxxx>
>
> in previous versions, it makes sense to continue allowing it (but we
> still remove any garbage after it). OTOH, when an address is given
> without quoting, we just take the first word and ignore everything
> after.
>
> Signed-off-by: Matthieu Moy <git@xxxxxxxxxxxxxxx>
> ---
> Also available as: https://github.com/git/git/pull/398
>
>  git-send-email.perl   | 26 ++++++++++++++++++++++++--
>  t/t9001-send-email.sh |  4 ++++
>  2 files changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index fa6526986e..33a69ffe5d 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -1089,6 +1089,26 @@ sub sanitize_address {
>  
>  }
>  
> +sub strip_garbage_one_address {
> +	my ($addr) = @_;
> +	chomp $addr;
> +	if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
> +		# "Foo Bar" <foobar@xxxxxxxxxxx> [possibly garbage here]
> +		# Foo Bar <foobar@xxxxxxxxxxx> [possibly garbage here]
> +		return $1;
> +	}
> +	if ($addr =~ /^(<[^>]*>).*/) {
> +		# <foo@xxxxxxxxxxx> [possibly garbage here]
> +		# if garbage contains other addresses, they are ignored.
> +		return $1;
> +	}

Isn't this already covered by the first one, which allows an
optional "something", followed by an optional run of SPs, in front
of this exact pattern, so the case where the optional "something"
does not appear and the number of optional SP is zero would exactly
match the one this pattern is meant to cover.

> +	if ($addr =~ /^([^"#,\s]*)/) {
> +		# address without quoting: remove anything after the address
> +		return $1;
> +	}
> +	return $addr;
> +}

By the way, these three regexps smell like they were written
specifically to cover three cases you care about (perhaps the ones
in your proposed log message), but what will be our response when
somebody else comes next time to us and says that their favourite
formatting of "Cc:" line is not covered by these rules?  

Will we add yet another pattern?  Where will it end?  There will be
a point where we instead start telling them to update the convention
of their project so that it will be covered by one of the patterns
we have already developed, I would imagine.

So, from that point of view, I, with devil's advocate hat on, wonder
why we are not saying

	"Cc: s@xxxxx # cruft"?  Use "Cc: <s@xxxxx> # cruft" instead
	and you'd be fine.

right now, without this patch.

I do not _mind_ us trying to be extra nice for a while, and I
certainly do not mind _this_ particular patch that gives us a single
helper function that future "here is another way to spell cruft"
rules can go, but I feel that there should be some line that lets us
say that we've done enough.

Thanks.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux