Matthieu Moy <git@xxxxxxxxxxxxxxx> writes: > This is a followup over 9d33439 (send-email: only allow one address > per body tag, 2017-02-20). The first iteration did allow writting > > Cc: <foo@xxxxxxxxxxx> # garbage > > but did so by matching the regex ([^>]*>?), i.e. stop after the first > instance of '>'. However, it did not properly deal with > > Cc: foo@xxxxxxxxxxx # garbage > > Fix this using a new function strip_garbage_one_address, which does > essentially what the old ([^>]*>?) was doing, but dealing with more > corner-cases. Since we've allowed > > Cc: "Foo # Bar" <foobar@xxxxxxxxxxx> > > in previous versions, it makes sense to continue allowing it (but we > still remove any garbage after it). OTOH, when an address is given > without quoting, we just take the first word and ignore everything > after. > > Signed-off-by: Matthieu Moy <git@xxxxxxxxxxxxxxx> > --- > Also available as: https://github.com/git/git/pull/398 > > git-send-email.perl | 26 ++++++++++++++++++++++++-- > t/t9001-send-email.sh | 4 ++++ > 2 files changed, 28 insertions(+), 2 deletions(-) > > diff --git a/git-send-email.perl b/git-send-email.perl > index fa6526986e..33a69ffe5d 100755 > --- a/git-send-email.perl > +++ b/git-send-email.perl > @@ -1089,6 +1089,26 @@ sub sanitize_address { > > } > > +sub strip_garbage_one_address { > + my ($addr) = @_; > + chomp $addr; > + if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) { > + # "Foo Bar" <foobar@xxxxxxxxxxx> [possibly garbage here] > + # Foo Bar <foobar@xxxxxxxxxxx> [possibly garbage here] > + return $1; > + } > + if ($addr =~ /^(<[^>]*>).*/) { > + # <foo@xxxxxxxxxxx> [possibly garbage here] > + # if garbage contains other addresses, they are ignored. > + return $1; > + } Isn't this already covered by the first one, which allows an optional "something", followed by an optional run of SPs, in front of this exact pattern, so the case where the optional "something" does not appear and the number of optional SP is zero would exactly match the one this pattern is meant to cover. > + if ($addr =~ /^([^"#,\s]*)/) { > + # address without quoting: remove anything after the address > + return $1; > + } > + return $addr; > +} By the way, these three regexps smell like they were written specifically to cover three cases you care about (perhaps the ones in your proposed log message), but what will be our response when somebody else comes next time to us and says that their favourite formatting of "Cc:" line is not covered by these rules? Will we add yet another pattern? Where will it end? There will be a point where we instead start telling them to update the convention of their project so that it will be covered by one of the patterns we have already developed, I would imagine. So, from that point of view, I, with devil's advocate hat on, wonder why we are not saying "Cc: s@xxxxx # cruft"? Use "Cc: <s@xxxxx> # cruft" instead and you'd be fine. right now, without this patch. I do not _mind_ us trying to be extra nice for a while, and I certainly do not mind _this_ particular patch that gives us a single helper function that future "here is another way to spell cruft" rules can go, but I feel that there should be some line that lets us say that we've done enough. Thanks.