On 10.04.2017 at 15:38, Jeff King wrote:
> Are those bugs? Maybe. Certainly they are limitations. But are they ones
> anybody _cares_ about? I think this may fall under "if it hurts, don't
> do it".

It's not always possible to avoid that.
URLs, for example, may contain "funny characters", including multi-byte
characters of which the second byte is 0x0a. If they are guaranteed to
always be URL-encoded this isn't a problem, but then we still need to
make sure that URL-encoding does happen.
The next source of funny characters that comes to mind is submodules.
By default they derive both their name and their subdirectory name from
the URL. Again, consider a multi-byte name where the second byte is
0x0a. Or 0x80: À (uppercase A with grave accent) contains that byte in
its UTF-8 encoding, and Ẁ is U+1E80, which would be encoded as 0x80 0x1e
on an NTFS filesystem. (Additional coding steps in APIs or webservices
complicate the situation further, but they don't usually eliminate the
problem; they just shift it around.)
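
To make the byte values concrete, here is a small Python sketch that prints
the encoded bytes of the characters mentioned above. The helper name
encoded_bytes is made up for illustration, and "utf-16-le" is used as a
stand-in for how NTFS stores names. U+0A01 is an extra example that is not
from the original mail: UTF-8 never puts 0x0a inside a multi-byte sequence,
so the 0x0a case needs an encoding such as UTF-16.

```python
def encoded_bytes(ch, encoding):
    """Return the bytes of `ch` in `encoding` as a readable hex string."""
    return " ".join(f"0x{b:02x}" for b in ch.encode(encoding))


# U+00C0 (A with grave accent) in UTF-8 contains the byte 0x80.
print(encoded_bytes("\u00c0", "utf-8"))      # 0xc3 0x80

# U+1E80 (W with grave accent) in little-endian UTF-16.
print(encoded_bytes("\u1e80", "utf-16-le"))  # 0x80 0x1e

# U+0A01 in little-endian UTF-16: the second byte is a line feed.
print(encoded_bytes("\u0a01", "utf-16-le"))  # 0x01 0x0a
```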
> If there are security bugs where a malicious input can cause us
> to do something bad, that's something to care about. But that's very
> different than asking "do these tests run to completion with a funny
> input".

If the tests do not complete, git is doing something unexpected.
That in itself is not a security hole, but there's a pretty good chance
that at least one of these ~230 unexpected things can be turned into
one, given enough time and motivation. The risk multiplies as this is
shell scripting, where the path from "string is misinterpreted" to
"string is run as a command" is considerably shorter than in other
languages.
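
As a rough illustration of how short that path is, the following Python
sketch drives a POSIX sh (assumed to be on PATH) with a name that contains
a newline. The functions run_unquoted and run_quoted and the sample name are
hypothetical; they stand in for a careless and a careful script and are not
taken from git's test suite.

```python
import shlex
import subprocess


def run_unquoted(name):
    # Careless variant: the name is pasted straight into the shell source.
    script = "echo checking " + name
    return subprocess.run(["sh", "-c", script],
                          capture_output=True, text=True).stdout


def run_quoted(name):
    # Careful variant: shlex.quote() wraps the name in single quotes.
    script = "echo checking " + shlex.quote(name)
    return subprocess.run(["sh", "-c", script],
                          capture_output=True, text=True).stdout


# A "funny" name: the part after the newline looks like a command.
name = "repo\necho INJECTED"

print(run_unquoted(name).splitlines())  # ['checking repo', 'INJECTED']
print(run_quoted(name).splitlines())    # ['checking repo', 'echo INJECTED']
```

In the unquoted run the shell executes the second line as a command of its
own; in the quoted run the same bytes are only ever data.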