On 05/26/2016 01:34 AM, Mike Hommey wrote:
On Tue, May 24, 2016 at 06:44:26AM +0200, Torsten Bögershausen wrote:
On 05/23/2016 11:30 PM, Junio C Hamano wrote:
Torsten Bögershausen <tboegi@xxxxxx> writes:
get_host_and_port(&ssh_host, &port);
+ /* get_host_and_port may not return a port even when
+ * there is one: In the [host:port]:path case,
+ * get_host_and_port is called with "[host:port]" and
+ * returns "host:port" and NULL.
+ * In that specific case, we still need to split the
+ * port. */
Is it worth mentioning that this case is "still supported legacy"?
If it's worth mentioning anywhere, it seems to me it would start with
urls.txt?
Mike
I don't know.
urls.txt is for Git users, and connect.c is for Git developers.
urls.txt does not mention that Git follows any RFC when parsing
URLs, and it doesn't claim to be 100% compliant,
even though it would make sense to do so: many users simply expect Git to
accept RFC-compliant URLs, and development is easier when there is an
already written specification that describes all the details.
The parser is not 100% RFC compliant, one example:
- old-style usage like "git clone [host:222]:~/path/to/repo" is supported
Is it an option to fix get_host_and_port() so that it returns what
the caller expects even when it is given "[host:port]"? When the
caller passes other forms like "host:port", it expects to get "host"
and "port" parsed out into two variables. Why can't the caller
expect to see the same happen when feeding "[host:port]" to the
function?
This is off the top of my head:
git clone git://[example.com:123]:/test # illegal, malformed URL
git clone [example.com:123]:/test       # scp-like URL, legacy
git clone ssh://[example.com:123]:/test # illegal, but supported as legacy, because
git clone ssh://[user@::1]/test         # was the only way to specify a user name at a literal IPv6 address
Maybe we should have some more test cases for malformed git:// URLs?
None of these malformed urls are rejected with or without my series
applied:
Without:
$ git fetch-pack --diag-url git://[example.com:123]:/test
Diag: url=git://[example.com:123]:/test
Diag: protocol=git
Diag: hostandport=[example.com:123]:
Diag: path=/test
$ git fetch-pack --diag-url ssh://[example.com:123]:/test
Diag: url=ssh://[example.com:123]:/test
Diag: protocol=ssh
Diag: userandhost=example.com
Diag: port=123
Diag: path=/test
With:
$ git fetch-pack --diag-url git://[example.com:123]:/test
Diag: url=git://[example.com:123]:/test
Diag: protocol=git
Diag: user=NULL
Diag: host=example.com
Diag: port=123
Diag: path=/test
$ git fetch-pack --diag-url ssh://[example.com:123]:/test
Diag: url=ssh://[example.com:123]:/test
Diag: protocol=ssh
Diag: user=NULL
Diag: host=example.com
Diag: port=123
Diag: path=/test
Note in the first case, hostandport is "[example.com:123]:", and that
is treated as host=example.com:123 and port=NULL further down, so my
series is changing something here, but arguably it makes things less bad.
(note that both with and without my series,
"git://[example.com:123]:42/path" is treated the same, so only a corner
case changed)
Can we go forward with the current series (modulo the comment style
thing Eric noted, and maybe adding a note about the parser handling urls
as per urls.txt), and not bloat its scope? If anything, the state of the
code after the series should make further parser changes easier.
Cheers,
Mike
Thanks for digging.
How about something like this:
/*
* get_host_and_port may not return a port in the [host:port]:path case.
* To support this undocumented legacy we still need to split the port.
*/
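
To make the "split the port" step concrete, here is a minimal sketch of what
the caller-side fallback could look like. It is not the code from connect.c
or from the series: split_trailing_port() is a hypothetical name, and the
sketch assumes the brackets have already been stripped, so that the input is
the "host:port" string described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not Git's actual code.
 *
 * If everything after the first ':' in host is a decimal number in the
 * range 0..65535, cut the string at that colon and return a pointer to
 * the port digits. Otherwise leave host untouched and return NULL.
 */
static char *split_trailing_port(char *host)
{
    char *colon, *end;
    long portnr;

    assert(host != NULL);

    colon = strchr(host, ':');
    if (!colon)
        return NULL;            /* plain "host", nothing to split */

    portnr = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != '\0' || portnr < 0 || portnr > 65535)
        return NULL;            /* e.g. "::1": not a host:port pair */

    *colon = '\0';              /* host now ends before the colon */
    return colon + 1;
}
```

Under these assumptions, "example.com:123" is split into "example.com" and
"123", while an IPv6 literal such as "::1" is left alone because the text
after its first colon is not a number. A caller would only reach for this
when get_host_and_port() has left the port as NULL.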