Re: [PATCH v2] http.c: use CURLOPT_RANGE for range requests

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 30, 2015 at 06:54:42PM -0400, David Turner wrote:

> A HTTP server is permitted to return a non-range response to a HTTP
> range request (and Apache httpd in fact does this in some cases).
> While libcurl knows how to correctly handle this (by skipping bytes
> before and after the requested range), it only turns on this handling
> if it is aware that a range request is being made.  By manually
> setting the range header instead of using CURLOPT_RANGE, we were
> hiding the fact that this was a range request from libcurl.  This
> could cause corruption.

The goal makes sense. Why weren't we using CURLOPT_RANGE before? Did it
not exist (or otherwise have limitations) in 2005, and if so, when did
it become usable? Do we need to protect this with an #ifdef for the curl
version?

> @@ -1213,8 +1213,9 @@ static int http_request(const char *url,
>  			curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
>  					 fwrite);
>  			if (posn > 0) {
> -				strbuf_addf(&buf, "Range: bytes=%ld-", posn);
> -				headers = curl_slist_append(headers, buf.buf);
> +				strbuf_addf(&buf, "%ld-", posn);
> +				curl_easy_setopt(slot->curl, CURLOPT_RANGE,
> +						 &buf.buf);
>  				strbuf_reset(&buf);
>  			}
>  		} else

We reuse curl slots from request to request, and curl options may
persist.  The old code appended to the header list, which then gets
added to the curl object with CURLOPT_HTTPHEADER. Subsequent requests,
even if they are not using ranges, would also set CURLOPT_HTTPHEADER,
possibly to NULL, so the range header would not accidentally creep into
the next request.

But with CURLOPT_RANGE, I think the value would persist to the next
request (unless curl automagically forgets it after making the request,
but I don't think it generally does so). I think the cleanest thing
would be to reset it back to NULL in get_active_slot (there are several
other fields that we reset there, too).

> @@ -1685,7 +1680,6 @@ struct http_object_request *new_http_object_request(const char *base_url,
>  	ssize_t prev_read = 0;
>  	long prev_posn = 0;
>  	char range[RANGE_HEADER_SIZE];
> -	struct curl_slist *range_header = NULL;
>  	struct http_object_request *freq;

Junio asked elsewhere if we really need to tweak RANGE_HEADER_SIZE. But
I think we should go even further, and bump the definition much closer
to the point of use[1], and not worry about an arbitrary constant. After
all, we already use sizeof(), so we do not even end up repeating the
constant.

We could even hide the whole thing away with something like:

  void http_set_range(CURL *curl, long lo, long hi)
  {
	char buf[128];
	int len = 0;

	if (lo >= 0)
		len += xsnprintf(buf + len, "%ld", lo);
	len += xsnprintf(buf + len, "-");
	if (hi >= 0)
		len += xsnprintf(buf + len, "%ld", hi);

	curl_easy_setopt(curl, CURLOPT_RANGE, buf);
  }

That would also make it easier to replace if we do need to keep an
#ifdef for older versions of curl. But maybe it is just
over-engineering.

-Peff

[1] Antique versions of curl needed the buffer to remain valid through
    the whole request, but these days it makes its own copy (and even
    under the old regime, I am not sure if the existing code would work
    anyway, as we return the request from this function, and the buffer
    is function-local).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]