Re: [PATCH v2] http: do not ignore proxy path

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jul 27, 2024 at 06:44:42AM +0000, Ryan Hendrickson via GitGitGadget wrote:

>     Re earlier discussion about regressions, it turns out that curl has only
>     supported this syntax since 2022 [https://curl.se/ch/7.84.0.html].
>     Earlier versions of curl, if provided an http_proxy that contained a
>     path, would also ignore it. curl didn't seem to consider this a breaking
>     change when the feature was introduced
>     [https://github.com/curl/curl/pull/8668], though; is that a valid
>     argument for Git not to lose sleep over it either?

IMHO it is probably OK. The path did not do anything before then, so why
would anybody pass it?

Looking into the curl support, I did notice two things:

  - unix sockets are only supported for socks proxies. Is it meaningful
    to have a path on an http proxy? I don't think so (there is no place
    to send it in the "GET" line). But perhaps we should limit the new
    code only to socks.

  - curl says you should put "localhost" in the host field for this,
    though obviously it is not otherwise used. Should we likewise
    require that to kick in the code?

> @@ -1265,7 +1266,24 @@ static CURL *get_curl_handle(void)
>  		if (!proxy_auth.host)
>  			die("Invalid proxy URL '%s'", curl_http_proxy);
>  
> -		curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host);
> +		strbuf_addstr(&proxy, proxy_auth.host);
> +		if (proxy_auth.path) {

There's one gotcha here I wondered about: we usually throw away the path
element, unless credential.useHTTPPath is set. That only happens after
we load the per-credential config, though, via credential_apply_config(),
which in turn is triggered by a call to credential_fill(), etc.

I think this is OK here, since we have just called credential_from_url()
ourselves, and the earliest credential_fill() we'd hit is from
init_curl_proxy_auth(), which is called right after the code you're
touching.

> +			int path_is_supported = 0;
> +			/* curl_version_info was added in curl 7.10 */
> +#if LIBCURL_VERSION_NUM >= 0x070a00
> +			curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW);
> +			path_is_supported = ver->version_num >= 0x075400;
> +#endif

We've been trying to keep our curl version #ifdefs in git-curl-compat.h,
with human-readable names. But in this case, I think we can ditch it
entirely, as we require curl 7.21.3 or later already.

This will be the first time we check curl_version_info() instead of a
build-time option. I think that makes sense here. Most features require
extra information at build-time (e.g., CURLOPT_* constants), but in this
case it is purely a check on the runtime behavior.

We perhaps could get away with just letting an older curl quietly ignore
the path field, but what you have here is probably a bit friendlier for
users.

> diff --git a/t/socks5-proxy-server/src/socks5.pl b/t/socks5-proxy-server/src/socks5.pl
> new file mode 100755
> index 00000000000..3ef66a1a287
> --- /dev/null
> +++ b/t/socks5-proxy-server/src/socks5.pl

This is a lot more code than I was hoping for. There are definitely
parts of it we don't care about (e.g, threading, handling multiple
connections at once) that could be pared down a bit. I don't think we
particularly care about auth either (we just want to make sure we talk
to the unix-socket proxy at all).

What if we switched to socks4, which drops all of the auth bits and
supports only IPs, not hostnames (that came in socks4a). This is the
shortest I could come up with (only lightly tested):

-- >8 --
use strict;
use IO::Select;
use IO::Socket::UNIX;
use IO::Socket::INET;

my $path = shift;

unlink($path);
my $server = IO::Socket::UNIX->new(Listen => 1, Local => $path)
	or die "unable to listen on $path: $!";

$| = 1;
print "ready to rumble\n";

while (my $client = $server->accept()) {
	sysread $client, my $buf, 8;
	my ($version, $cmd, $port, $ip) = unpack 'CCnN', $buf;
	next unless $version == 4; # socks4
	next unless $cmd == 1; # TCP stream connection

	# skip NUL-terminated id
	while (sysread $client, my $char, 1) {
		last unless ord($char);
	}

	# version(0), reply(5a == granted), port (ignored), ip (ignored)
	syswrite $client, "\x00\x5a\x00\x00\x00\x00\x00\x00";

	my $remote = IO::Socket::INET->new(PeerHost => $ip, PeerPort => $port)
		or die "unable to connect to $ip/$port: $!";

	my $io = IO::Select->new($client, $remote);
	while ($io->count) {
		for my $fh ($io->can_read(0)) {
			for my $pair ([$client, $remote], [$remote, $client]) {
				my ($from, $to) = @$pair;
				next unless $fh == $from;

				my $r = sysread $from, my $buf, 1024;
				if (!defined $r || $r <= 0) {
					$io->remove($from);
					next;
				}
				syswrite $to, $buf;
			}
		}
	}
}
-- >8 --

That's still more than I'd like, but quite a bit smaller. I dunno.
Probably the one you found is more battle-tested, but I really think we
have pretty simple requirements.

> diff --git a/t/t5564-http-proxy.sh b/t/t5564-http-proxy.sh
> index bb35b87071d..bafaa31adf8 100755
> --- a/t/t5564-http-proxy.sh
> +++ b/t/t5564-http-proxy.sh
> @@ -39,4 +39,22 @@ test_expect_success 'clone can prompt for proxy password' '
>  	expect_askpass pass proxuser
>  '
>  
> +start_socks() {
> +	# The %30 tests that the correct amount of percent-encoding is applied
> +	# to the proxy string passed to curl.
> +	"$PERL_PATH" "$TEST_DIRECTORY/socks5-proxy-server/src/socks5.pl" \
> +		"$TRASH_DIRECTORY/%30.sock" proxuser proxpass &
> +	socks_pid=$!
> +}
> +
> +test_lazy_prereq LIBCURL_7_84 'expr x$(curl-config --vernum) \>= x075400'

This is the first time we'd be relying on curl-config in the test suite.
I suspect it would _usually_ work, but I'd worry about a few things:

  1. The Makefile allows for a different path for $(CURL_CONFIG), or
     even skipping it entirely by setting $(CURLDIR).

  2. We'd usually build and test in the same environment, but not
     necessarily. In particular, I know Windows uses separate CI jobs
     for this, and I'm not sure if curl-config will be around in the
     test jobs.

I can see two paths forward:

  a. There's been recent discussion about adding an option for Git to
     report the run-time curl version. That doesn't exist yet, though
     "git version --build-options" will report the build-time version.
     That might be good enough for the purposes of testing a build.

  b. Write the test such that it expects the request to work _or_ we see
     the "version too old" failure.

> +test_expect_success PERL,LIBCURL_7_84 'clone via Unix socket' '

I'm not sure if this PERL prereq is sufficient. We also need to know
that we can make unix sockets. Probably we need to actually run the
socks proxy and make sure it gets to the "starting..." line before
accepting that it works. Which also gets us to...

> +	start_socks &&
> +	test_when_finished "rm -rf clone && kill $socks_pid" &&
> +	test_config_global http.proxy "socks5://proxuser:proxpass@localhost$PWD/%2530.sock" &&
> +	GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone &&

This is racy. We started the proxy in the background, but we don't know
that it's ready to accept connections by the time we run "git clone". My
first few runs all failed, but putting a "sleep 1" in between fixed it
(which obviously is not a real fix, but just a debugging aid).

If you usually see success, try running the test script with "--stress"
to create extra load, and you should see failures.

The usual trick here is to start the process in the background, and then
in the foreground wait for some "I'm ready" output, which involves ugly
tricks with fifos. There's prior art in lib-git-daemon.sh (though it's a
little more complicated there, because we want to relay its stderr, too,
but with a script under our control we can just put the ready message on
stdout).

So coupled with the prereq fix that I mentioned above, you might be able
to do something like (totally untested):

  start_socks_proxy () {
	mkfifo socks_output &&
	{
		perl proxy.pl ... >socks_output &
		socks_pid=$!
	} &&
	read line <socks_output &&
	test "$line" = "Starting..."
  }

  test_expect_success 'try to start socks proxy' '
	if start_socks_proxy
	then
		test_seq_prereq SOCKS_PROXY
	fi
  '

  test_expect_success SOCKS_PROXY 'clone via unix socket' '
	...
  '

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux