Re: [PATCH 0/3] [GSOC][RFC] ref-filter: add contents:raw atom

Junio C Hamano <gitster@xxxxxxxxx> writes:

>> git for-each-ref --format="%(contents)" --python refs/mytrees/first
>>
>> will output a string processed by python_quote_buf_with_size(), which
>> contains '\0'. But the binary files seem to be useless after quoting. Should
>> we allow these binary files to be output in the default way with
>> strbuf_add()? If so, we can remove the first patch.
>
> The --language option is designed to be used to write a small script
> in the language and used like this:
>
>     git for-each-ref --format='
> 		name=%(refname)
> 		var=%(placeholder)
>                 mkdir -p "$(dirname "$name")"
> 		printf "%%s" "$var" >"$name"
>     ' --shell | /bin/sh
>
> Note that %(refname) and %(placeholder) in the --format string are
> not quoted at all; the "--shell" option knows how values are quoted
> in the host language (shell) and writes single-quotes around
> %(refname).  If %(placeholder) produces something with a single-quote
> in it, that will (eh, at least "should") be quoted appropriately.
>
> So it does not make any sense not to quote a value that comes from
> %(placeholder), whether it is binary or not, to match the syntax of
> the host language you are making the "for-each-ref --format=" to
> write such a script in.
>
> So, "binary files seem to be useless after quoting" is a
> misunderstanding.  They are useless if you do not quote them.

Another thing to keep in mind is that not all host languages may be
capable of expressing a string with NUL in it.  Most notably, shell.
The --shell quoting rule used by for-each-ref would produce an
equivalent of the "script" produced like this:

	$ tr Q '\000' >script <<\EOF
	#!/bin/sh
	varname='varQname'
	echo "$varname"
	EOF

but I do not think it would say 'var' followed by a NUL followed by
'name'.  The NUL is likely lost when assigned to the variable.
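
A quick way to check this on a given system is sketched below.  It
is only an illustration: single_quote_for_sh() and shell_roundtrip()
are hypothetical helper names, the quoting is a simplified
single-quote rule rather than the code for-each-ref uses, and what
exactly happens to the NUL (dropped, truncated, or an error) depends
on which shell /bin/sh is.

```python
import subprocess


def single_quote_for_sh(value: bytes) -> bytes:
    """Wrap value in single quotes, escaping embedded single quotes.

    Simplified, hypothetical stand-in for a --shell style quoting rule.
    """
    return b"'" + value.replace(b"'", b"'\\''") + b"'"


def shell_roundtrip(value: bytes) -> bytes:
    """Assign the quoted value to a shell variable and print it back."""
    script = (
        b"var=" + single_quote_for_sh(value) + b"\n"
        + b"printf '%s' \"$var\"\n"
    )
    # Feed the generated script to the shell on stdin, as in "... | /bin/sh".
    result = subprocess.run(
        ["/bin/sh"],
        input=script,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.stdout
```

Comparing shell_roundtrip(b"varname") with shell_roundtrip(b"var\0name")
shows whether the second value survives the assignment intact.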

So for some host languages, binaries may be useless with or without
quoting.  But for ones that can use strings to hold an arbitrary byte
sequence, it should be OK to let for-each-ref quote the byte
sequence as a string literal for the language (so that the exact
byte sequence will end up being in the variable after assignment).
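
To make "quote the byte sequence as a string literal" concrete, here
is a minimal sketch for one such host language.  It assumes Python 3
bytes literals as the target; quote_as_bytes_literal() is a
hypothetical name and is not what python_quote_buf_with_size() does.

```python
import ast


def quote_as_bytes_literal(data: bytes) -> str:
    """Return Python 3 source text for a bytes literal equal to data."""
    out = ["b'"]
    for byte in data:
        ch = chr(byte)
        if ch in ("'", "\\"):
            # Quote and backslash would end or alter the literal; escape them.
            out.append("\\" + ch)
        elif 0x20 <= byte < 0x7F:
            # Printable ASCII is emitted as-is.
            out.append(ch)
        else:
            # Everything else, including NUL, becomes a \xNN escape.
            out.append("\\x%02x" % byte)
    out.append("'")
    return "".join(out)


def roundtrips(data: bytes) -> bool:
    """Check that evaluating the literal gives back the exact bytes."""
    return ast.literal_eval(quote_as_bytes_literal(data)) == data
```

With this rule, roundtrips(b"var\0name") holds, which is the property
the quoting has to give: the variable ends up with the exact byte
sequence after assignment.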

That reminds me of another thing.  The --python thing was written
back when Python3 was still a distant dream and strings were the
appropriate type for a random sequence of bytes (as opposed to
unicode, which cannot have a random sequence of bytes).  Somebody
needs to check if it needs any update to work with Python3.
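
One thing such a check could look at is whether the generated text
is still acceptable as Python3 source at all.  The sketch below
assumes a generated script that carries a raw byte inside a '...'
literal, which may or may not match what --python emits today;
compiles_as_python3() is a hypothetical helper.

```python
def compiles_as_python3(source: bytes) -> bool:
    """Return True if the interpreter accepts source as a module."""
    try:
        # Source given as bytes is decoded as UTF-8 unless it declares
        # another encoding.
        compile(source, "<generated>", "exec")
    except (SyntaxError, ValueError):
        return False
    return True


# A plain ASCII assignment.
ASCII_LITERAL = b"var = 'abc'\n"
# A raw 0xff byte inside a str literal; not valid UTF-8 source.
RAW_BYTE_LITERAL = b"var = '\xff'\n"
# The same byte written as an escape inside a bytes literal.
ESCAPED_BYTES_LITERAL = b"var = b'\\xff\\x00'\n"
```

Running compiles_as_python3() on the three samples shows which forms
Python3 takes; the escaped bytes literal keeps the source pure ASCII
while still describing arbitrary bytes.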


