Re: infelicities in git hash-object --stdin-paths with special characters

On Mon, Dec 02, 2024 at 01:41:07PM -0400, Joey Hess wrote:
> Apparently "Icon\r" is a common filename on OSX, anyway it's a legal
> unix filename. It seems that sending a line containing that filename to
> git hash-object --stdin-paths triggers some DOS-style CRLF handling.
> Here I am running git version 2.45.2 on Linux.
> 
> $ touch Icon^M
> $ printf 'Icon\r\n' | git hash-object --stdin-paths
> fatal: could not open 'Icon' for reading: No such file or directory
> 
> $ echo 'wrong file!' > Icon
> $ printf 'Icon\r\n' | git hash-object --stdin-paths
> 1c43b74a7787621318ee7442eb5a36e32476f326
> 
> While looking at builtin/hash-object.c to see why it might do this, I quickly
> noticed another odd behavior:
> 
> $ touch '"foo"'
> $ printf '"foo"\n' | git hash-object --stdin-paths
> fatal: could not open 'foo' for reading: No such file or directory
> 
> $ touch '"foo'
> $ printf '"foo\n' | git hash-object --stdin-paths
> fatal: line is badly quoted
> 
> The documentation does not seem to mention that quoted lines in
> --stdin-paths are at all special. Of course, quoting would be one way to
> work around the CRLF problem, if it were documented.

Indeed -- the documentation does not mention quoting at all, but we do
use `unquote_c_style()` to parse paths. So a C-style quoted path works
where the raw path does not:

    $ echo foobar >"$(printf 'something\n\rsomething')"
    $ printf 'something\n\rsomething' | git hash-object --stdin-paths
    fatal: could not open 'something' for reading: No such file or directory
    $ printf '"something\\n\\rsomething"' | git hash-object --stdin-paths
    323fae03f4606ea9991df8befbb2fca795e648fa

Note that you have to escape both "\n" and "\r" inside the double
quotes; Git then unquotes the path for you. This really needs to be
documented, though.
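
For scripts that cannot know in advance which bytes a path contains, one
option is to quote every path unconditionally. The following is only a
minimal sketch: `quote_path` is a hypothetical helper name, not something
Git ships, and it assumes that `unquote_c_style()` accepts three-digit
octal escapes for arbitrary bytes (Git emits that form itself when it
quotes paths).

```shell
# Hypothetical helper: print "$1" as a C-style quoted string in which
# every byte is written as a three-digit octal escape (\ooo).
# Assumption: the consumer unquotes \ooo escapes, as Git's C-style
# unquoting is expected to.
quote_path() {
    printf '"'
    # od prints each input byte as three octal digits; tr puts one
    # value per line so the loop can prefix each with a backslash.
    printf '%s' "$1" | od -A n -v -t o1 | tr -s ' ' '\n' |
    while read -r oct; do
        if [ -n "$oct" ]; then
            printf '\\%s' "$oct"
        fi
    done
    printf '"\n'
}
```

With that in place, something like

    $ quote_path "$(printf 'Icon\r')" | git hash-object --stdin-paths

should hash the file whose name ends in a carriage return rather than
"Icon". The output is far from readable, but it avoids having to decide
per character which bytes need escaping.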

> It seems that some parts of git that read filenames from stdin use
> strbuf_getline_lf and others use strbuf_getdelim_strip_crlf. There does
> not seem to be any consistency, and my impression is any user is best
> off using -z, when the command supports it, to avoid the mess.
> 
> Given all that, maybe adding -z to hash-object would be a good "fix".

I think this is a good idea regardless of whether we document the
quoting behaviour or not. It is way easier for programs to emit
NUL-terminated paths than to implement the quoting rules used by Git.
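
Until such an option exists, a possible workaround is to skip
--stdin-paths entirely and pass the paths as arguments, since
hash-object also accepts files on the command line. This is a sketch
only: `with_nul_paths` is a hypothetical wrapper name, and it assumes an
xargs that supports `-0` (GNU and BSD both do).

```shell
# Hypothetical wrapper: read NUL-terminated paths from stdin and append
# them as arguments to the given command. No quoting or CRLF handling
# is involved, because the paths never pass through a line parser.
# Assumption: xargs supports -0.
with_nul_paths() {
    xargs -0 "$@"
}
```

Usage would look like this:

    $ printf 'Icon\r\0' | with_nul_paths git hash-object --

The "--" keeps paths that start with a dash from being parsed as
options. Note that xargs may split a very long list across several
invocations of the command.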

Patrick



