On Tue, May 24, 2016 at 06:44:26AM +0200, Torsten Bögershausen wrote:
> On 05/23/2016 11:30 PM, Junio C Hamano wrote:
> > Torsten Bögershausen <tboegi@xxxxxx> writes:
> >
> > > > > >   get_host_and_port(&ssh_host, &port);
> > > > > > + /* get_host_and_port may not return a port even when
> > > > > > +  * there is one: In the [host:port]:path case,
> > > > > > +  * get_host_and_port is called with "[host:port]" and
> > > > > > +  * returns "host:port" and NULL.
> > > > > > +  * In that specific case, we still need to split the
> > > > > > +  * port. */
> > > > > Is it worth to mention that this case is "still supported legacy" ?
> > > > If it's worth mentioning anywhere, it seems to me it would start with
> > > > urls.txt?
> > > >
> > > > Mike
> > > >
> > > I don't know.
> > > urls.txt is for Git users, and connect.c is for Git developers.
> > > urls.txt does not mention that Git follows any RFC when parsing the
> > > URLS', it doesn't claim to be 100% compliant.
> > > Even if it makes sense to do so, as many user simply expect Git to accept
> > > RFC compliant URL's, and it makes the development easier, if there is
> > > an already written specification, that describes all the details.
> > > The parser is not 100% RFC compliant, one example:
> > > - old-style usgage like "git clone [host:222]:~/path/to/repo are supported
> > Is it an option to fix get_host_and_port() so that it returns what
> > the caller expects even when it is given "[host:port]"?  When the
> > caller passes other forms like "host:port", it expects to get "host"
> > and "port" parsed out into two variables.  Why can't the caller
> > expect to see the same happen when feeding "[host:port]" to the
> > function?
> >
> This is somewhat out of my head:
> git clone git://[example.com:123]:/test #illegal, malformated URL
> git clone [example.com:123]:/test #scp-like URL, legacy
> git clone ssh://[example.com:123]:/test #illegal, but supported as legacy, because
> git clone ssh://[user@::1]/test # was the only way to specify a user name at a literal IPv6 address
>
> May be we should have some more test cases for malformated git:// URLs?

None of these malformed URLs is rejected, with or without my series
applied:

Without:
$ git fetch-pack --diag-url git://[example.com:123]:/test
Diag: url=git://[example.com:123]:/test
Diag: protocol=git
Diag: hostandport=[example.com:123]:
Diag: path=/test
$ git fetch-pack --diag-url ssh://[example.com:123]:/test
Diag: url=ssh://[example.com:123]:/test
Diag: protocol=ssh
Diag: userandhost=example.com
Diag: port=123
Diag: path=/test

With:
$ git fetch-pack --diag-url git://[example.com:123]:/test
Diag: url=git://[example.com:123]:/test
Diag: protocol=git
Diag: user=NULL
Diag: host=example.com
Diag: port=123
Diag: path=/test
$ git fetch-pack --diag-url ssh://[example.com:123]:/test
Diag: url=ssh://[example.com:123]:/test
Diag: protocol=ssh
Diag: user=NULL
Diag: host=example.com
Diag: port=123
Diag: path=/test

Note that in the first case (git://, without my series), hostandport is
"[example.com:123]:", and that is treated as host=example.com:123 and
port=NULL further down. So my series does change something here, but
arguably it makes things less bad. (Note that
"git://[example.com:123]:42/path" is treated the same with and without
my series, so only a corner case changed.)

Can we go forward with the current series (modulo the comment style
thing Eric noted, and maybe adding a note about the parser handling
URLs as per urls.txt), and not bloat its scope? If anything, the state
of the code after the series should make further parser changes easier.
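To make the "[host:port]" special case from the quoted patch comment
concrete, here is a rough, standalone sketch of splitting a port out of
the text found between the brackets. The name split_legacy_port() is
hypothetical and this is not the code in connect.c; it only assumes the
behaviour discussed above, namely that "example.com:123" should yield a
host and a port while an IPv6 literal such as "::1" is left alone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only (not taken from connect.c).
 *
 * "host" is the text that was found between the brackets, for example
 * "example.com:123".  If everything after the first ':' is a decimal
 * port number, cut the string at that colon and return a pointer to the
 * port digits.  Otherwise leave the buffer untouched and return NULL.
 */
static char *split_legacy_port(char *host)
{
	char *colon, *end;
	long port;

	assert(host);

	colon = strchr(host, ':');
	if (!colon)
		return NULL;		/* plain "host", nothing to split */

	port = strtol(colon + 1, &end, 10);
	if (end == colon + 1 || *end != '\0' || port < 0 || port > 65535)
		return NULL;		/* e.g. "::1" is not host:port */

	*colon = '\0';			/* "host" now ends before the port */
	return colon + 1;
}
```

With this sketch, a buffer holding "example.com:123" ends up as
"example.com" with the returned pointer at "123", while "::1" comes back
unchanged with a NULL port. Looking only at the first colon is a
deliberate simplification: it is enough to keep IPv6 literals intact in
the cases discussed in this thread.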
Cheers,

Mike