Re: [PATCH resend] git-send-email: Use sanitized address when reading mbox body

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Csókás Bence <csokas.bence@xxxxxxxxx> writes:

> On 6/21/24 19:27, Junio C Hamano wrote:
>> Hmph, there is this piece of code before the block:
>> 			if ($c !~ /.+@.+|<.+>/) {
>> 				printf("(body) Ignoring %s from line '%s'\n",
>> 					$what, $_) unless $quiet;
>> 				next;
>> 			}
>> This is to reject strings that do not even look like an e-mail
>> address, but we called sanitize_address() call on $c way before this
>> check.  I wonder if we should move this block way up, even before
>> the call to santize_address()?
>> That was a relatively unrelated tangent.
>
> Hm, maybe. Should I add this to the patch as well?

No, it is an unrelated tangent.  We try not to mix unrelated things
into a single patch.  Perhaps as a separate "clean-up" patch after
the dust from the main topic settles, or as a preliminary "clean-up"
before the main topic, would be good (but see below).

>> It looks to me that there are many other places that we try to be as
>> faithful to the original as possible.  In the same block as the one
>> that handles "Cc:" I quoted above, an address on "From:" is also
>> sent intact into @cc and addresses on "To:" are handled the same
>> way.
>> The patch under discussion singles out the addresses on the trailers
>> in the message body and treat them differently from others, which I
>> am not sure is what we want to do.
>
> I think it is a reasonable assumption that the mbox headers will be
> conforming, whereas the message body is just freeform text and no such
> guarantees exist. But if we want to be paranoid about it, we could try
> and sanitize everything.

I actually do not think we want to be more paranoid.  I'd rather
trust what the user gave us, which was where my response came from.

Having said that, given that the "Ignoring ..." check we saw earlier
in this message triggers only for trailers, it may be a sensible
position to take that mail headers are more trustworthy and a random
address-looking strings on the trailer lines should be checked more
carefully.  Because the "Ignoring ..." check is primarily to reject
strings that are not addresses (things like bug IDs, or just
people's names without e-mail addresses to credit for a new feature
request) that we do not ever intend to actually Cc: the message to,
another sensible choice coming from that "strings on trailers may
not even be addresses" position may be not to add the $c to the Cc:
list, if $sc (the sanitized address) and $c (the original address)
are different.  That is, "the simple regexp check currently used to
trigger 'Ignoring ...' message thought that the string looked like
an address, but when we ask sanitize_address, it turns out that it
was not, really."  And if we were to take that route, "Ignoring ..."
check, which I called an unrelated tangent, would become very much
related to the topic at hand, as it would mean the resulting logic
would look more like this:

	my $sc = sanitize_address($c);
	if ($c !~ /.+@.+|<.+>/ || $sc ne $c) {
        	printf("(body) Ignoring %s from line '%s'\n",
		       $what, $_) unless $quiet;
		next;
	}

In any case, if we were to move forward with this topic (whether
"send to corrected $sc instead" or "$c is suspicious, do not add it
to $cc" is picked as the direction), the motivation behind the
design decision to treat the address taken from a trailer line
differently needs to be explained better, I think.  I had a chance
to ask you directly on list, and you gave a good explanation to me
in the message I am responding to, but you will not be around when
future readers of "git log" want to ask the same question "why are
we singling this out while we use the original address in all the
other places?".

Your proposed commit log message is the place to help them.

Oh, before I forget, is this something we can test easily in t9001?
We would want to protect a new behaviour from accidental breakage,
so adding a test or two would be a good thing.

Thanks.





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux