Since 9d2e942 (decode file:// and ssh:// URLs, 2010-05-23) the URL logic unquotes escaped URLs. For the %2B type of escape, this is conformant with RFC 2396. However, it also unquotes + into a space character, which is only appropriate for the query strings in HTTP. This notably broke fetching from the gtk+ repository. We cannot just remove the corresponding code since the same url_decode_internal() is also used by the HTTP backend to decode query parameters. Introduce a new argument that controls whether the + decoding happens, and use it only in the (client-side) url_decode(). Reported-by: Jasper St. Pierre <jstpierre@xxxxxxxxxxx> Signed-off-by: Thomas Rast <trast@xxxxxxxxxxxxxxx> --- I wrote: > Junio C Hamano wrote: > > > > http-backend.c::get_info_refs() > > -> http-backend.c::get_parameter() > > -> http-backend.c::get_parameters() > > -> url.c::url_decode_parameter_value() > > -> url.c::url_decode_internal() > > You're right, I forgot about those. I imagine it would be one of two > cases: [...] > Shawn, can you help with this? The third case, of course, is: * It only uses these functions for parameter decoding, which of course was correct to begin with. So after hopefully drinking enough coffee, I made this one. The catch is that I'm not entirely clear whether *not* decoding the + client-side anywhere in the URL is correct for http:// URLs? If the client decodes and re-encodes the URL, then the + would be turned into a %2B on the re-encoding. Then again maybe UI-facing URLs should never have a query part at all? t/t5601-clone.sh | 10 ++++++++-- url.c | 11 ++++++----- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index 8abb71a..4431dfd 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -178,8 +178,14 @@ test_expect_success 'clone respects global branch.autosetuprebase' ' test_expect_success 'respect url-encoding of file://' ' git init x+y && - test_must_fail git clone "file://$PWD/x+y" xy-url && - git clone "file://$PWD/x%2By" xy-url + git clone "file://$PWD/x+y" xy-url-1 && + git clone "file://$PWD/x%2By" xy-url-2 +' + +test_expect_success 'do not query-string-decode + in URLs' ' + rm -rf x+y && + git init "x y" && + test_must_fail git clone "file://$PWD/x+y" xy-no-plus ' test_expect_success 'do not respect url-encoding of non-url path' ' diff --git a/url.c b/url.c index 2306236..cd8f74f 100644 --- a/url.c +++ b/url.c @@ -67,7 +67,8 @@ static int url_decode_char(const char *q) return val; } -static char *url_decode_internal(const char **query, const char *stop_at, struct strbuf *out) +static char *url_decode_internal(const char **query, const char *stop_at, + struct strbuf *out, int decode_plus) { const char *q = *query; @@ -90,7 +91,7 @@ static char *url_decode_internal(const char **query, const char *stop_at, struct } } - if (c == '+') + if (decode_plus && c == '+') strbuf_addch(out, ' '); else strbuf_addch(out, c); @@ -110,17 +111,17 @@ char *url_decode(const char *url) strbuf_add(&out, url, colon - url); url = colon; } - return url_decode_internal(&url, NULL, &out); + return url_decode_internal(&url, NULL, &out, 0); } char *url_decode_parameter_name(const char **query) { struct strbuf out = STRBUF_INIT; - return url_decode_internal(query, "&=", &out); + return url_decode_internal(query, "&=", &out, 1); } char *url_decode_parameter_value(const char **query) { struct strbuf out = STRBUF_INIT; - return url_decode_internal(query, "&", &out); + return url_decode_internal(query, "&", &out, 1); } -- 1.7.2.278.g76edd.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html