Re: git-am doesn't strip CRLF line endings when the mbox is base64-encoded

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



George Dunlap <george.dunlap@xxxxxxxxxx> writes:

> On 12/18/19 12:15 PM, George Dunlap wrote:
>> On 12/18/19 11:42 AM, George Dunlap wrote:
>>> Using git 2.24.0 from Debian testing.
>>>
>>> It seems that git-am will strip CRLF endings from mails before applying
>>> patches when the mail isn't encoded in any way.  It will also decode
>>> base64-encoded mails.  But it won't strip CRLF endings from
>>> base64-encoded mails.
>>>
>>> Attached are two mbox files for two different recent series.
>>> plainenc.am applies cleanly with `git am`, while base64enc.am doesn't.
>>>
>>> Poking around the man pages, it looks like part of the issue might be
>>> that the CRLF stripping is done in `git mailsplit`, before the base64
>>> encoding, rather than after.
>> 
>> Poking around -- it looks like the CRLF stripping would be better done
>> in `git mailinfo` after the decoding.
>
> Anyone want to take this up?  I mean, I could try to send a patch, but
> since I've never looked at the git source code before, I'm sure it would
> take me about 10x as much effort for me to do it as for someone already
> familiar with the codebase.

Even before writing a patch, somebody needs to come up with a
sensible design first.  --[no-]keep-cr is about "because transfer of
e-mail messages between MTAs and to the receiving MUA is defined in
terms of CRLF delimited lines per RFC, Git cannot tell if the CRLF
in the input was meant to be part of the patch (i.e. the diff is
describing a change between preimage and postimage of a file that
uses CRLF line endings) or they are cruft added during transit.  By
default we favor LF endings so we will strip, but we leave an option
to keep CRs at the end of lines".  

What you are asking for is quite different, isn't it?  "We know the
CRLF in the payload is from the original because they were protected
from getting munged during the transfer by being MIME-encased.
Please tell Git to preprocess that payload to convert CRLF to LF
before treating it as a patch".

So, if you are imagining to change the meaning of --[no-]keep-cr, I
do not think it will fly (that is why I said that we need a sensible
design before a patch).

And by stepping back a bit like so, and once we start viewing this
as "after receiving a piece of e-mail from MUA (where --[no-]keep-cr
may affect the outermost CRLF line endings) and unwrapping possible
MIME-encasing, we can optionally tell Git to pass the payload
further through a preprocess filter", we'd realize that this does
not have to be limited to just running dos2unix (you may want to run
iconv to fix encodings, for example), which would mean that the new
flag may not just want to be --strip-cr, which is too limiting, but
rather want to be --filter-message=<how> where <how> could be one of
the canned preprocess filter (among which your dos2unix may exist)
or an external script.

I am not saying that "--filter-message=<how>" must be the "sensible
design" I mentioned at the beginning of this message---the above is
to illustrate what kind of thought needs to go in before even the
first line of the patch gets written.

Thanks.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux