Re: [PATCH v5 1/1] cat-file: quote-format name in error when using -z

Toon Claes <toon@xxxxxxxxx> writes:

> Ideally the output should be NUL-terminated if -z is used. This was also
> suggested[2] when the flag was introduced. Obviously we cannot change
> this now, because it would break behavior for *everyone* using -z, not
> only when funny names are used. So if we want to go this route, we
> should only do so with another flag (e.g. `--null-output`) or a config
> option.

Yes, `--null-output` also came to my mind.  As this new mode of
output is for consumption by programs, letting them read
NUL-terminated records is a viable, if cumbersome, possibility.
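
For illustration only, a reader of such NUL-terminated output might
look like the sketch below.  It is in Python, `--null-output` does
not exist today, and the helper name split_nul_records is made up
for this example.

```python
def split_nul_records(data: bytes) -> list[bytes]:
    """Split a byte stream of NUL-terminated records into a list.

    Assumes every record, including the last one, ends with a NUL byte
    (an assumption about a hypothetical output mode, not existing behavior).
    """
    if not data:
        return []
    if not data.endswith(b"\0"):
        raise ValueError("truncated stream: last record lacks its NUL")
    # Drop the final terminator, then split on the remaining ones.
    return data[:-1].split(b"\0")
```

A record with an embedded LF comes back as one piece, which is the
whole point, but the caller can no longer use ordinary line-oriented
reading.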

> But I was looking at the git-config(1) documentation:
>
>> core.quotePath::
>> 	Commands that output paths (e.g. 'ls-files', 'diff'), will
>> 	quote "unusual" characters in the pathname by enclosing the
>> 	pathname in double-quotes and escaping those characters with
>> 	backslashes in the same way C escapes control characters (e.g.
>> 	`\t` for TAB, `\n` for LF, `\\` for backslash) or bytes with
>> 	values larger than 0x80 (e.g. octal `\302\265` for "micro" in
>> 	UTF-8).  If this variable is set to false, bytes higher than
>> 	0x80 are not considered "unusual" any more. Double-quotes,
>> 	backslash and control characters are always escaped regardless
>> 	of the setting of this variable.  A simple space character is
>> 	not considered "unusual".  Many commands can output pathnames
>> 	completely verbatim using the `-z` option. The default value
>> 	is true.
>
> If you read this, the changes of this patch fully contradict this.

Hmph, I do not quite see where the contradiction is.  If you mean
"Many commands can output" part, I do not think it applies here.
First, your "cat-file" does not have to be a part of "many".  More
importantly, the mention of `-z` there is about the option accepted
by the diff family of commands, e.g. "git diff --name-only -z
HEAD^", where it is an output record separator.  Your "-z" is about
the input record separator, and unless you are changing "-z" to
suddenly mean both input and output separator, breaking existing
scripts that expect "-z" to apply only to input, the above
"completely verbatim" does not apply to you.
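
To make the distinction concrete, here is a rough sketch (Python,
with made-up helper names, and a sample reply that is only
illustrative) of what "-z" as an input record separator means for a
caller: the requests are NUL-terminated, while the replies are still
read one LF-terminated line at a time.

```python
def build_z_input(names: list[bytes]) -> bytes:
    """Encode names as NUL-terminated input records (the side "-z" affects)."""
    for name in names:
        if b"\0" in name:
            raise ValueError("a name cannot contain NUL")
    return b"".join(name + b"\0" for name in names)


def read_lf_replies(output: bytes) -> list[bytes]:
    """Split replies on LF only; "-z" on input does not change this side."""
    lines = output.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # the final LF is a terminator, not an empty record
    return lines
```

A name such as "HEAD:a<LF>b" survives build_z_input() intact, but a
reply that echoes it verbatim is seen by read_lf_replies() as two
lines.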

> Also
> documentation on other commands (e.g. git-check-ignore(1)) using `-z`
> will mention the verbatim output.

Again, it is about the output.

Stepping back a bit, how big a problem is this in real life?  It
certainly is possible to create a pathname with funny byte values in
it, and in some environments, letters like single-quote that are
considered cumbersome to handle by those who are used to CLI
programs may be commonplace.  But a path with newline?  Or any
control character for that matter?  And this is not even the primary
output from the program but is an error message for consumption by
humans, no?

I am wondering if it is simpler to just declare that the paths
output in error messages have certain bytes, probably all control
characters other than HT, replaced with a dot, and tell the users
not to rely on the pathnames being intact if they contain funny
bytes in them.  That way, with the definition of "work" being "you
can read the path out of error messages that talk about it", paths
with bytes that c-quote mechanism butchers, like double quotes and
backslashes, that have worked before will not be broken, and paths
with LF or CRLF in them that have never worked would still not work,
but at least they would not break the input stream of whoever is
reading the error messages line by line.
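
A minimal sketch of that replacement, in Python for brevity and
nothing like an actual patch.  The name sanitize_for_message is
hypothetical, and taking "control characters" to mean bytes
0x00-0x1f plus 0x7f (DEL) is an assumption.

```python
def sanitize_for_message(path: bytes) -> bytes:
    """Replace control bytes other than HT with "." for a one-line message.

    "Control" is assumed to mean 0x00-0x1f and 0x7f (DEL).
    """
    out = bytearray()
    for byte in path:
        is_control = byte < 0x20 or byte == 0x7F
        # Keep HT (0x09) and every non-control byte exactly as it is.
        out.append(ord(".") if is_control and byte != 0x09 else byte)
    return bytes(out)
```

Double quotes, backslashes and bytes above 0x80 pass through
untouched, so names that were readable in messages before stay
readable, while LF and CR each turn into a dot and the message stays
on one line.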

I dunno.




