On 11/16/2017 9:14 PM, Junio C Hamano wrote:
Jeff King <peff@xxxxxxxx> writes:
Those encodings don't necessarily need to be the same, because they're
about transport. Inside each process we'd have the raw bytes, and encode
them as appropriate to whatever sub-program we're going to pass to (or
not at all if we skip the shell for sub-processes, which is usually a
good idea).
Yes, I share the same feeling. It does not help that the series
defines its own notion of arg_needs_armor() and uses it to set a
field called requires_armor that is not yet used, the definition of
"armor"ing being each byte getting encoded as two hexadecimal digits
without any sign (which makes me wonder what a receiver of
"deadbeef" would do---did it receive an armored string or a plain
one???). I do not understand why these strings are not passed as
opaque sequences of bytes and instead converted at this low a layer.
I'm probably being too paranoid. My fear is that a client could pass
an expression to clone/fetch/fetch-pack that would be sent to the
server and evaluated by the interface between upload-pack and pack-objects.
I'm not worried about the pack-protocol transport. I'm mainly concerned
in how upload-pack passes that *client-expression* to pack-objects and are
there ways for that to go south on the server with a carefully crafted
expression.
Even if we assume that upload-pack on the server directly invokes
pack-objects (rather than a shell), there still might be issues.
For platforms like Linux which have a native execve() and can pass
args in an argv (and which the sub-process also receives in an argv
in their main()), my paranoia is probably overkill.
But on Windows, where the native interface takes a command-line string
rather than an argv, I was concerned. Yes, there is code in compat/mingw.c
to quote args when building a command line from an argv (and I'm *not*
saying there are bugs in that), but again maybe I am being paranoid.
I'll take another look and the existing quoting mechanisms and re-eval.
Thanks,
Jeff