Re: [PATCH v5 1/1] cat-file: quote-format name in error when using -z


 



Hi Junio

Sorry for the slow reply, I had intended to reply before but got distracted and forgot about it.

On 15/05/2023 18:20, Junio C Hamano wrote:
Phillip Wood <phillip.wood123@xxxxxxxxx> writes:

On 12/05/2023 17:57, Junio C Hamano wrote:
Toon Claes <toon@xxxxxxxxx> writes:
Stepping back a bit, how big a problem is this in real life?  It
certainly is possible to create a pathname with funny byte values in
it, and in some environments, letters like single-quote that are
considered cumbersome to handle by those who are used to CLI
programs may be commonplace.  But a path with newline?  Or any
control character for that matter?  And this is not even the primary
output from the program but is an error message for consumption by
humans, no?
I am wondering if it is simpler to just declare that the paths
output in error messages have certain bytes, probably all control
characters other than HT, replaced with a dot, and tell the users
not to rely on the pathnames being intact if they contain funny
bytes in them.
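
A minimal C sketch of the redaction idea described above, for illustration only. `redact_name` is a hypothetical helper, not an existing git function, and treating DEL (0x7f) as a control character is an assumption that the text does not spell out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of git): copy `name` into `out`,
 * replacing every control byte other than HT with a dot.
 * Output is truncated to fit `outsz` and is always NUL-terminated.
 */
static void redact_name(const char *name, char *out, size_t outsz)
{
	size_t i;

	assert(name && out && outsz > 0);
	for (i = 0; name[i] && i + 1 < outsz; i++) {
		unsigned char c = (unsigned char)name[i];
		/* Assumption: DEL (0x7f) counts as a control character. */
		int is_ctrl = (c < 0x20 && c != '\t') || c == 0x7f;

		out[i] = is_ctrl ? '.' : name[i];
	}
	out[i] = '\0';
}
```

The result is deliberately lossy: a name made of lowercase-A, newline, lowercase-B and a name made of lowercase-A, dot, lowercase-B both come out as a.b, which is why users would be told not to rely on such pathnames being intact.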

We could only c-quote the name when it contains a control character
other than HT. That way names containing double quotes and backslashes
are unchanged but it will still be possible to parse the path from the
error message. If we're going to munge the name we might as well use
our standard quoting rather than some ad-hoc scheme.

In the above suggestion, I gave up and no longer aim to do
"quoting".  A more appropriate word for the approach is "redacting".
The message essentially is: If you use truly problematic bytes in
your path, they are redacted (so do not use them if it hurts).

This is because I am not sure how "names containing dq and bs are
unchanged" can be done without ambiguity.

D'oh, I should have thought of that. You're right, it ends up being ambiguous. Anyway, Patrick has just posted a patch to add NUL-terminated output, which looks like a cleaner approach.

Best Wishes

Phillip

 If I see a message that
comes out of this:

	printf("%s missing\n", obj_name);

and it looks like

	"a\nb" missing

how do I tell if it is complaining about the object the user named
with a three-byte string (i.e. lowercase-A, newline, lowercase-B),
or a six-byte string (i.e. dq, lowercase-A, bs, lowercase-N,
lowercase-B, dq)?
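
The collision can be reproduced with a small C sketch of the "c-quote only when the name contains a control character other than HT" rule. The names `needs_quoting` and `quote_if_needed` are hypothetical, this is not git's actual quoting code, and counting DEL as a control character is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Return 1 if `name` contains a control byte other than HT. */
static int needs_quoting(const char *name)
{
	for (; *name; name++) {
		unsigned char c = (unsigned char)*name;

		if ((c < 0x20 && c != '\t') || c == 0x7f)
			return 1;
	}
	return 0;
}

/*
 * Hypothetical formatter: names without control bytes are copied
 * unchanged (dq and bs included); anything else is wrapped in double
 * quotes with C-style escapes.  Output is truncated to fit `outsz`.
 */
static void quote_if_needed(const char *name, char *out, size_t outsz)
{
	size_t n = 0;

	assert(name && out && outsz > 0);
	if (!needs_quoting(name)) {
		snprintf(out, outsz, "%s", name);
		return;
	}
	if (n + 1 < outsz)
		out[n++] = '"';
	/* Reserve room for one 4-byte escape, the closing dq and the NUL. */
	for (; *name && n + 5 < outsz; name++) {
		unsigned char c = (unsigned char)*name;

		if (c == '\n') {
			out[n++] = '\\';
			out[n++] = 'n';
		} else if (c == '\t') {
			out[n++] = '\\';
			out[n++] = 't';
		} else if (c == '"' || c == '\\') {
			out[n++] = '\\';
			out[n++] = (char)c;
		} else if (c < 0x20 || c == 0x7f) {
			/* Other control bytes become three-digit octal escapes. */
			n += (size_t)snprintf(out + n, outsz - n, "\\%03o", c);
		} else {
			out[n++] = (char)c;
		}
	}
	if (n + 1 < outsz)
		out[n++] = '"';
	out[n] = '\0';
}
```

Passing the three-byte name (lowercase-A, newline, lowercase-B) and the six-byte name (dq, lowercase-A, bs, lowercase-N, lowercase-B, dq) through `quote_if_needed` yields the same six output bytes, so a reader of the error message cannot tell which object name was meant.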

If we were forbidding '"' to appear in a refname, then we could take
advantage of the fact that the name of an object inside a tree at a
funny path would not start with '"', to disambiguate.  For the
three- and six-byte string cases above, the formatting function will
give these messages (referred to as "sample output" below):

	"master:a\nb" missing
	master:"a\nb" missing

because of your "we do not exactly do our standard c-quote; we
exempt dq and bs from the bytes to be quoted" rule.

But it still feels a bit misleading.  This codepath may have the
whole objectname as a single string so that c-quoting the entire
"<commit> <colon> <path>" inside a single c-quoted string that
begins with a dq is easy, but not all codepaths are lucky and some
may have to show <commit> and <path> separately, concatenated with
<colon> at the outermost output layer, which means that the second
one from the sample output may still mean the path with three-byte
name in the tree of 'master' commit.

And worse yet, because

	git branch '"master'

is possible (even though nobody sane would do that), the "treat the
string as c-quoted only if the object name as a whole begins with a
dq" disambiguation idea would not work.  The first one from
the sample output could be the blob at the path with a five-byte
string name (i.e. lowercase-A, bs, lowercase-N, lowercase-B, dq)
in the tree of the commit at the tip of branch with seven-byte
string name (i.e. dq followed by 'master').

So, I dunno.



