Re: [PATCH v3 34/35] remote-curl: implement stateless-connect command

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Brandon Williams wrote:

> Teach remote-curl the 'stateless-connect' command which is used to
> establish a stateless connection with servers which support protocol
> version 2.  This allows remote-curl to act as a proxy, allowing the git
> client to communicate natively with a remote end, simply using
> remote-curl as a pass through to convert requests to http.

Cool!  I better look at the spec for that first.

*looks at the previous patch*

Oh, there is no documented spec. :/  I'll muddle through this instead,
then.

[...]
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -188,7 +188,10 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
>  	heads->version = discover_version(&reader);
>  	switch (heads->version) {
>  	case protocol_v2:
> -		die("support for protocol v2 not implemented yet");
> +		/*
> +		 * Do nothing.  Client should run 'stateless-connect' and
> +		 * request the refs themselves.
> +		 */
>  		break;

This is the 'list' command, right?  Since we expect the client to run
'stateless-connect' instead, can we make it error out?

[...]
> @@ -1082,6 +1085,184 @@ static void parse_push(struct strbuf *buf)
>  	free(specs);
>  }
>  
> +struct proxy_state {
> +	char *service_name;
> +	char *service_url;
> +	struct curl_slist *headers;
> +	struct strbuf request_buffer;
> +	int in;
> +	int out;
> +	struct packet_reader reader;
> +	size_t pos;
> +	int seen_flush;
> +};

Can this have a comment describing what it is/does?  It's not obvious
to me at first glance.

It doesn't have to go in a lot of detail since this is an internal
implementation detail, but something saying e.g. that this represents
a connection to an HTTP server (that's an example; I'm not saying
that's what it represents :)) would help.

> +
> +static void proxy_state_init(struct proxy_state *p, const char *service_name,
> +			     enum protocol_version version)
[...]
> +static void proxy_state_clear(struct proxy_state *p)

Looks sensible.

[...]
> +static size_t proxy_in(char *buffer, size_t eltsize,
> +		       size_t nmemb, void *userdata)

Can this have a comment describing the interface?  (E.g. does it read
a single pkt_line?  How is the caller expected to use it?  Does it
satisfy the interface of some callback?)

libcurl's example https://curl.haxx.se/libcurl/c/ftpupload.html just
calls this read_callback.  Such a name plus a pointer to
CURLOPT_READFUNCTION should do the trick; bonus points if the comment 
says what our implementation of the callback does.

Is this about having peek ability?

> +{
> +	size_t max = eltsize * nmemb;

Can this overflow?  st_mult can avoid having to worry about that.

> +	struct proxy_state *p = userdata;
> +	size_t avail = p->request_buffer.len - p->pos;
> +
> +	if (!avail) {
> +		if (p->seen_flush) {
> +			p->seen_flush = 0;
> +			return 0;
> +		}
> +
> +		strbuf_reset(&p->request_buffer);
> +		switch (packet_reader_read(&p->reader)) {
> +		case PACKET_READ_EOF:
> +			die("unexpected EOF when reading from parent process");
> +		case PACKET_READ_NORMAL:
> +			packet_buf_write_len(&p->request_buffer, p->reader.line,
> +					     p->reader.pktlen);
> +			break;
> +		case PACKET_READ_DELIM:
> +			packet_buf_delim(&p->request_buffer);
> +			break;
> +		case PACKET_READ_FLUSH:
> +			packet_buf_flush(&p->request_buffer);
> +			p->seen_flush = 1;
> +			break;
> +		}
> +		p->pos = 0;
> +		avail = p->request_buffer.len;
> +	}
> +
> +	if (max < avail)
> +		avail = max;
> +	memcpy(buffer, p->request_buffer.buf + p->pos, avail);
> +	p->pos += avail;
> +	return avail;

This is a number of bytes, but CURLOPT_READFUNCTION expects a number
of items, fread-style.  That is:

	if (avail < eltsize)
		... handle somehow, maybe by reading in more? ...

	avail_memb = avail / eltsize;
	memcpy(buffer,
	       p->request_buffer.buf + p->pos,
	       st_mult(avail_memb, eltsize));
	p->pos += st_mult(avail_memb, eltsize);
	return avail_memb;

But https://curl.haxx.se/libcurl/c/CURLOPT_READFUNCTION.html says

	Your function must then return the actual number of bytes that
	it stored in that memory area.

Does this mean eltsize is always 1?  This is super confusing...

... ok, a quick grep for fread_func in libcurl reveals that eltsize is
indeed always 1.  Can we add an assertion so we notice if that
changes?

	if (eltsize != 1)
		BUG("curl read callback called with size = %zu != 1", eltsize);
	max = nmemb;

[...]
> +static size_t proxy_out(char *buffer, size_t eltsize,
> +			size_t nmemb, void *userdata)
> +{
> +	size_t size = eltsize * nmemb;
> +	struct proxy_state *p = userdata;
> +
> +	write_or_die(p->out, buffer, size);
> +	return size;
> +}

Nice.  Same questions about st_mult or just asserting on eltsize apply
here, too.

[...]
> +static int proxy_post(struct proxy_state *p)

What does this function do?  Can it get a brief comment?

> +{
> +	struct active_request_slot *slot;
> +	int err;
> +
> +	slot = get_active_slot();
> +
> +	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
> +	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
> +	curl_easy_setopt(slot->curl, CURLOPT_URL, p->service_url);
> +	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, p->headers);
> +
> +	/* Setup function to read request from client */
> +	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, proxy_in);
> +	curl_easy_setopt(slot->curl, CURLOPT_READDATA, p);
> +
> +	/* Setup function to write server response to client */
> +	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, proxy_out);
> +	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, p);
> +
> +	err = run_slot(slot, NULL);
> +
> +	if (err != HTTP_OK)
> +		err = -1;
> +
> +	return err;

HTTP_OK is 0 but kind of obscures that.  How about something like the
following?

	if (run_slot(slot, NULL))
		return -1;
	return 0;

or

	if (run_slot(slot, NULL) != HTTP_OK)
		return -1;
	return 0;

That way it's clearer that this always returns 0 or -1.

[...]
> +static int stateless_connect(const char *service_name)
> +{
> +	struct discovery *discover;
> +	struct proxy_state p;
> +
> +	/*
> +	 * Run the info/refs request and see if the server supports protocol
> +	 * v2.  If and only if the server supports v2 can we successfully
> +	 * establish a stateless connection, otherwise we need to tell the
> +	 * client to fallback to using other transport helper functions to
> +	 * complete their request.
> +	 */
> +	discover = discover_refs(service_name, 0);
> +	if (discover->version != protocol_v2) {
> +		printf("fallback\n");
> +		fflush(stdout);
> +		return -1;

Interesting.  I wonder if we can make remote-curl less smart and drive
this more from the caller.

E.g. if the caller could do a single stateless request, they could do:

	option git-protocol version=2
	stateless-request GET info/refs?service=git-upload-pack
	[pkt-lines, ending with a flush-pkt]

The git-protocol option in this hypothetical example is the value to
be passed in the Git-Protocol header.

Then based on the response, the caller could decide to keep using
stateless-request for further requests or fall back to "fetch".

That way, if we implement some protocol v3, the remote helper would
not have to be changed at all to handle it: the caller would instead
make the new v3-format request and remote-curl would be able to oblige
without knowing why they're doing it.

What do you think?

Thanks,
Jonathan



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux