Re: b4: unicode control characters -- warn or remove?

On Mon, Nov 01 2021, Eric Wong wrote:

> Konstantin Ryabitsev <konstantin@xxxxxxxxxxxxxxxxxxx> wrote:
>> Hi, all:
>> 
>> Per exhibit a, what should we do in the situation where we discover unicode
>> control characters in an email?
>> 
>> 1. Warn and strip these chars out, because they are extremely unlikely to be
>>    doing anything legitimate in the context of a patch (unless someone is
>>    sending patches for docs actually written in RTL languages)
>> 2. Warn and error out, refusing to produce an mbox
>> 3. Just warn and produce an mbox anyway
>> 
>> I'd normally do #3, but with many people piping things to git-am, I'm not sure
>> if it's the safest choice.
>> 
>> Exhibit a: https://lwn.net/Articles/874546/
>
> +Cc: git@vger
>
> IMHO, defense for this belongs in git-am (which already checks
> things like whitespace).

It checks whitespace because that's something that's commonly a source
of patch corruption. I'm not averse to adding this to core.whitespace,
but trying to catch malicious injected code seems like a rather big
expansion of its scope, particularly since:

    "[...]sending patches for docs actually written in RTL languages[...]"

Or just code? People write comments, and even code, in their native
languages, and not all projects are as anglo-centric as those hosted on
kernel.org.

I haven't checked what the overlap is between solving this issue & i18n
support, but we definitely should not be assuming that git's only used
by kernel.org users & similar, even for something as relatively obscure
as git-am.
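
For illustration only, the three options listed in the quoted message
(strip, error out, or just warn) could be sketched in Python roughly as
below. This is a minimal sketch under stated assumptions, not b4's or
git-am's actual code: the names BIDI_CONTROLS, find_bidi_controls and
handle are hypothetical, and the character set is limited to the
bidirectional embedding, override and isolate characters.

```python
import sys

# Bidirectional formatting characters (embeddings, overrides, isolates).
# Assumption: this set is illustrative; a real check might cover more.
BIDI_CONTROLS = frozenset(
    "\u202a\u202b\u202c\u202d\u202e"  # LRE, RLE, PDF, LRO, RLO
    "\u2066\u2067\u2068\u2069"        # LRI, RLI, FSI, PDI
)


def find_bidi_controls(text):
    """Return (line, column, codepoint) for each bidi control found."""
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits


def handle(text, policy="warn"):
    """Apply one of three hypothetical policies: 'strip', 'error', 'warn'."""
    hits = find_bidi_controls(text)
    # All three policies warn about every occurrence first.
    for lineno, col, cp in hits:
        print(f"warning: {cp} at line {lineno}, column {col}", file=sys.stderr)
    if hits and policy == "error":
        raise ValueError("bidi control characters found; refusing to continue")
    if policy == "strip":
        return "".join(ch for ch in text if ch not in BIDI_CONTROLS)
    return text
```

Only the formatting characters are flagged here; ordinary RTL letters
pass through untouched. Legitimate RTL text may still contain such
formatting characters, which is exactly the concern with stripping or
refusing them by default.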


