Toon Claes <toon@xxxxxxxxx> writes:

> Ideally the output should be NUL-terminated if -z is used. This was also
> suggested[2] when the flag was introduced. Obviously we cannot change
> this now, because it would break behavior for *everyone* using -z, not
> only when funny names are used. So if we want to go this route, we
> should only do so with another flag (e.g. `--null-output`) or a config
> option.

Yes, `--null-output` also came to my mind.  As this new mode of
output is for consumption by programs, letting them read
NUL-terminated records is a viable, if cumbersome, possibility.

> But I was looking at the git-config(1) documentation:
>
>> core.quotePath::
>> Commands that output paths (e.g. 'ls-files', 'diff'), will
>> quote "unusual" characters in the pathname by enclosing the
>> pathname in double-quotes and escaping those characters with
>> backslashes in the same way C escapes control characters (e.g.
>> `\t` for TAB, `\n` for LF, `\\` for backslash) or bytes with
>> values larger than 0x80 (e.g. octal `\302\265` for "micro" in
>> UTF-8). If this variable is set to false, bytes higher than
>> 0x80 are not considered "unusual" any more. Double-quotes,
>> backslash and control characters are always escaped regardless
>> of the setting of this variable. A simple space character is
>> not considered "unusual". Many commands can output pathnames
>> completely verbatim using the `-z` option. The default value
>> is true.
>
> If you read this, the changes of this patch fully contradict this.

Hmph, I do not quite see where the contradiction is.  If you mean
the "Many commands can output" part, I do not think it applies here.
First, your "cat-file" does not have to be a part of "many".  More
importantly, the mention of `-z` there is about the option accepted
by the diff family of commands, e.g. "git diff --name-only -z HEAD^",
that is an output record separator.
Your "-z" is about the input record separator, and if you are not
changing "-z" to suddenly mean both input and output separator to
break existing scripts that expect "-z" to apply only to input, the
above "completely verbatim" does not apply to you.

> Also
> documentation on other commands (e.g. git-check-ignore(1)) using `-z`
> will mention the verbatim output.

Again, it is about the output.

Stepping back a bit, how big a problem is this in real life?  It
certainly is possible to create a pathname with funny byte values in
it, and in some environments, letters like single-quote that are
considered cumbersome to handle by those who are used to CLI
programs may be commonplace.  But a path with newline?  Or any
control character for that matter?  And this is not even the primary
output from the program but is an error message for consumption by
humans, no?

I am wondering if it is simpler to just declare that the paths
output in error messages have certain bytes, probably all control
characters other than HT, replaced with a dot, and tell the users
not to rely on the pathnames being intact if they contain funny
bytes in them.

That way, with the definition of "work" being "you can read the path
out of error messages that talk about it", paths with bytes that the
c-quote mechanism butchers, like double quotes and backslashes, that
have worked before will not be broken, and paths with LF or CRLF in
them, which have never worked, would still not work, but at least
they would not break the input stream of whoever is reading the
error messages line by line.

I dunno.
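
For illustration only, the "replace with a dot" idea could look
roughly like the Python sketch below.  The helper name is made up
and this is not Git code; treating DEL (0x7f) as a control character
alongside 0x00-0x1f is my assumption, as the proposal only says
"probably all control characters other than HT".

```python
def sanitize_path_for_message(path: bytes) -> bytes:
    """Hypothetical helper: replace control bytes other than HT with '.'.

    Assumption: "control character" means 0x00-0x1f plus DEL (0x7f).
    Double quotes, backslashes and bytes >= 0x80 pass through untouched.
    """
    out = bytearray()
    for b in path:
        is_control = b < 0x20 or b == 0x7F
        if is_control and b != 0x09:  # keep HT (horizontal tab) as-is
            out.append(ord("."))
        else:
            out.append(b)
    return bytes(out)


if __name__ == "__main__":
    samples = (
        b'with "quote" and \\backslash',  # left intact
        b"line1\nline2",                  # LF replaced by a dot
        b"cr\r\nlf",                      # CR and LF each replaced
        b"tab\there",                     # HT kept
    )
    for name in samples:
        print(repr(sanitize_path_for_message(name)))
```

The result never contains LF, so a reader that consumes the error
messages line by line still sees one message per line, at the cost
of not being able to recover the original name when it had funny
bytes in it.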