Re: [PATCH for-maint] apply: gitdiff_verify_name(): accept "/dev/null\r"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/23/14 20:54, Junio C Hamano wrote:
> Laszlo Ersek <lersek@xxxxxxxxxx> writes:
> 
>>   git format-patch master..branch1
> 
> The output from this has these (excerpt from "od -xc" output):
> 
> 0000360       f   2  \n  \n   d   i   f   f       -   -   g   i   t
>            6620    0a32    640a    6669    2066    2d2d    6967    2074
> 0000400   a   /   f   2       b   /   f   2  \n   n   e   w       f   i
>            2f61    3266    6220    662f    0a32    656e    2077    6966
> 0000420   l   e       m   o   d   e       1   0   0   6   4   4  \n   i
>            656c    6d20    646f    2065    3031    3630    3434    690a
> 0000440   n   d   e   x       0   0   0   0   0   0   0   .   .   f   3
>            646e    7865    3020    3030    3030    3030    2e2e    3366
> 0000460   5   d   3   e   6  \n   -   -   -       /   d   e   v   /   n
>            6435    6533    0a36    2d2d    202d    642f    7665    6e2f
> 0000500   u   l   l  \n   +   +   +       b   /   f   2  \n   @   @
>            6c75    0a6c    2b2b    202b    2f62    3266    400a    2040
> 0000520   -   0   ,   0       +   1       @   @  \n   +   h   e   l   l
>            302d    302c    2b20    2031    4040    2b0a    6568    6c6c
> 0000540   o       w   o   r   l   d  \r  \n   -   -      \n   2   .   1
>            206f    6f77    6c72    0d64    2d0a    202d    320a    312e
> 
> The structural parts of the diff, including "--- /dev/null" line,
> are all terminated by "\n" (as they should be), and the only CR
> appears in the message is at the end of "+hello world" line.

That's right -- until the patch email goes through an MTA that turns all
line endings into CRLF. (Did you email the patch to yourself as
requested in the reproducer?)

Such CRLFs are normally transparent because git-am strips them. The
keepcr=true setting preserves them, but not only for the source code
lines (where it's the right thing to do): it also preserves them in the
git diff header lines. Which is not a problem in general, *except* when
said header line includes /dev/null.

> So I do not think apply should need to loosen its sanity check and
> take a random whitespace after the "/dev/null" as a valid "this is a
> creation event for the path" marker (e.g. "--- /dev/null whoa"?).
> 
> is_dev_null() is used to in the fallback code path that parses
> traditional patch output (e.g. GNU diff) which throws random cruft
> (e.g. timestamp) after the /dev/null marker, e.g.
> 
>     $ diff -u /dev/null f2
>     --- /dev/null   2014-09-17 18:22:57.995111003 -0700
>     +++ f2  2014-09-23 11:37:09.000000000 -0700
>     @@ -0,0 +1 @@
>     +hello world
> 
> and we'd be hesitant to allow that kind of looseness for Git patches
> where we know we end the line after the "/dev/null" marker.

Okay, let's say I only relax the check in question to accept "\r\n" in
addition to the current "\n". Will you take that?

>> 3. In the reviewer / tester / maintainer role, save the patch from your
>> email client to a local file. Assume that your email client does not
>> corrupt the patch when saving it.
> 
> Perhaps compare this saved file with the output from the above
> format-patch to see where things got broken?
> 
> SMTP transport may be CRLF-unsafe, so I have a suspicion that it may
> turn out that what you are trying to do might be an equilvalent of
> 
> 	git format-patch ... |
>         # first lose all \r\n
>         dos2unix | 
> 	# then make everything \r\n
>         unix2dos |
>         # and apply
>         git am
> 
> which is not workable in the first place.  I dunno.

I agree with your analysis. It is indeed the MTA (or the MUA, not sure)
that turns all line endings into uniform CRLFs -- it is a requirement in
RFC 2822 / 822bis.

http://cr.yp.to/docs/smtplf.html
http://www.rfc-editor.org/rfc/rfc2822.txt

> 2.3. Body
>
>    The body of a message is simply lines of US-ASCII characters.  The
>    only two limitations on the body are as follows:
>
>    - CR and LF MUST only occur together as CRLF; they MUST NOT appear
>      independently in the body.

But why is this situation "not workable"? The same happens with *all*
patches that people mail around, it's just not visible to them, because
git-am strips all CRs indiscriminately.

People who are forced to work with CRLF repositories don't have this
luxury with git-am, and bump into the /dev/null problem all the time.

What do you think about accepting only "/dev/null\n" and "/dev/null\r\n"?

Another question I had about gitdiff_verify_name() -- what ensures there
that the memcmp(), with the fixed size of 9 bytes, won't fall off the
end of the line? Some of the outer caller functions verify the line
length before their comparisons, but I don't see any length checks in
gitdiff_verify_name() for /dev/null specifically.

Thank you,
Laszlo
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]