On Mon, Dec 02, 2024 at 01:41:07PM -0400, Joey Hess wrote:
> Apparently "Icon\r" is a common filename on OSX, anyway it's a legal
> unix filename. It seems that sending a line containing that filename to
> git hash-object --stdin-paths triggers some DOS-style CRLF handling.
> Here I am running git version 2.45.2 on Linux.
>
>     $ touch Icon^M
>     $ printf 'Icon\r\n' | git hash-object --stdin-paths
>     fatal: could not open 'Icon' for reading: No such file or directory
>
>     $ echo 'wrong file!' > Icon
>     $ printf 'Icon\r\n' | git hash-object --stdin-paths
>     1c43b74a7787621318ee7442eb5a36e32476f326
>
> While looking at builtin/hash-object.c to see why it might do this, I quickly
> noticed another odd behavior:
>
>     $ touch '"foo"'
>     $ printf '"foo"\n' | git hash-object --stdin-paths
>     fatal: could not open 'foo' for reading: No such file or directory
>
>     $ touch '"foo'
>     $ printf '"foo\n' | git hash-object --stdin-paths
>     fatal: line is badly quoted
>
> The documentation does not seem to mention that quoted lines in
> --stdin-paths are at all special. Of course, quoting would be one way to
> work around the CRLF problem, if it were documented.

Indeed -- the documentation does not mention quoting at all, but we do
use `unquote_c_style()` to parse paths. So the following works:

    $ echo foobar >"$(printf 'something\n\rsomething')"
    $ printf 'something\n\rsomething' | git hash-object --stdin-paths
    fatal: could not open 'something' for reading: No such file or directory
    $ printf '"something\\n\\rsomething"' | git hash-object --stdin-paths
    323fae03f4606ea9991df8befbb2fca795e648fa

Note that you have to escape both "\n" and "\r", and then Git handles
unquoting for you. This really needs documentation though.

> It seems that some parts of git that read filenames from stdin use
> strbuf_getline_lf and others use strbuf_getdelim_strip_crlf. There does
> not seem to be any consistency, and my impression is any user is best
> off using -z, when the command supports it, to avoid the mess.
>
> Given all that, maybe adding -z to hash-object would be a good "fix".

I think this is a good idea regardless of whether we document the
quoting behaviour or not. It is way easier for programs to embed NUL
characters than having to handle the quoting rules implemented by Git.

Patrick
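
For scripts that need to feed arbitrary paths to --stdin-paths in the
meantime, the quoting shown above could be produced roughly as follows.
This is only a sketch in POSIX shell: `quote_path` is a hypothetical
helper name, it escapes just backslash, double quote, LF and CR, and it
assumes these are unquoted the way C-style quoting normally works. It
has not been checked against every Git version.

```sh
# Hypothetical helper (not part of Git): wrap a path in C-style quotes
# so that a reader which unquotes C-style strings sees the original bytes.
# The body runs in a subshell, so its variables do not leak to the caller.
quote_path() (
    nl='
'
    cr=$(printf '\r')
    rest=$1
    out=
    while [ -n "$rest" ]; do
        tail=${rest#?}          # everything after the first character
        c=${rest%"$tail"}       # the first character itself
        case $c in
            '\')   out=$out'\\' ;;   # backslash    -> \\
            '"')   out=$out'\"' ;;   # double quote -> \"
            "$nl") out=$out'\n' ;;   # LF           -> \n
            "$cr") out=$out'\r' ;;   # CR           -> \r
            *)     out=$out$c ;;     # anything else is copied unchanged
        esac
        rest=$tail
    done
    printf '"%s"\n' "$out"
)

# Possible usage (assumes a file named "Icon<CR>" exists):
#   quote_path "$(printf 'Icon\r')" | git hash-object --stdin-paths
```

Note that every caller has to reimplement something like this, which is
part of why a NUL-delimited mode is simpler for programs.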