Re: [PATCH 1/2] http: reset POSTFIELDSIZE when clearing curl handle

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 03, 2024 at 08:30:54AM +0200, Patrick Steinhardt wrote:
> On Tue, Apr 02, 2024 at 04:05:17PM -0400, Jeff King wrote:
> > In get_active_slot(), we return a CURL handle that may have been used
> > before (reusing them is good because it lets curl reuse the same
> > connection across many requests). We set a few curl options back to
> > defaults that may have been modified by previous requests.
> > 
> > We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which
> > defaults to "-1"). This usually doesn't matter because most POSTs will
> > set both fields together anyway. But there is one exception: when
> > handling a large request in remote-curl's post_rpc(), we don't set
> > _either_, and instead set a READFUNCTION to stream data into libcurl.
> > 
> > This can interact weirdly with a stale POSTFIELDSIZE setting, because
> > curl will assume it should read only some set number of bytes from our
> > READFUNCTION. However, it has worked in practice because we also
> > manually set a "Transfer-Encoding: chunked" header, which libcurl uses
> > as a clue to set the POSTFIELDSIZE to -1 itself.
> > 
> > So everything works, but we're better off resetting the size manually
> > for a few reasons:
> > 
> >   - there was a regression in curl 8.7.0 where the chunked header
> >     detection didn't kick in, causing any large HTTP requests made by
> >     Git to fail. This has since been fixed (but not yet released). In
> >     the issue, curl folks recommended setting it explicitly to -1:
> > 
> >       https://github.com/curl/curl/issues/13229#issuecomment-2029826058
> > 
> >     and it indeed works around the regression. So even though it won't
> >     be strictly necessary after the fix there, this will help folks who
> >     end up using the affected libcurl versions.
> > 
> >   - it's consistent with what a new curl handle would look like. Since
> >     get_active_slot() may or may not return a used handle, this reduces
> >     the possibility of heisenbugs that only appear with certain request
> >     patterns.
> > 
> > Note that the recommendation in the curl issue is to actually drop the
> > manual Transfer-Encoding header. Modern libcurl will add the header
> > itself when streaming from a READFUNCTION. However, that code wasn't
> > added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST
> > if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to
> > support back to 7.19.5, so those older versions still need the manual
> > header.
> > 
> > Signed-off-by: Jeff King <peff@xxxxxxxx>
> > ---
> >  http.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/http.c b/http.c
> > index e73b136e58..3d80bd6116 100644
> > --- a/http.c
> > +++ b/http.c
> > @@ -1452,6 +1452,7 @@ struct active_request_slot *get_active_slot(void)
> >  	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL);
> >  	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL);
> >  	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL);
> > +	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, -1L);
> >  	curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0);
> >  	curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
> >  	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1);
> 
> Can't we refactor this code to instead use `curl_easy_reset()`? That
> function already resets most of the data we want to reset and would also
> end up setting `POSFIELDSIZE = -1` via `Curl_init_userdefined()`. So
> wouldn't the following be a more sensible fix?
> 
> diff --git a/http.c b/http.c
> index e73b136e58..e5f5bc23db 100644
> --- a/http.c
> +++ b/http.c
> @@ -1442,20 +1442,14 @@ struct active_request_slot *get_active_slot(void)
>  	slot->finished = NULL;
>  	slot->callback_data = NULL;
>  	slot->callback_func = NULL;
> +	curl_easy_reset(slot->curl);
>  	curl_easy_setopt(slot->curl, CURLOPT_COOKIEFILE, curl_cookie_file);
>  	if (curl_save_cookies)
>  		curl_easy_setopt(slot->curl, CURLOPT_COOKIEJAR, curl_cookie_file);
>  	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, pragma_header);
>  	curl_easy_setopt(slot->curl, CURLOPT_RESOLVE, host_resolutions);
>  	curl_easy_setopt(slot->curl, CURLOPT_ERRORBUFFER, curl_errorstr);
> -	curl_easy_setopt(slot->curl, CURLOPT_CUSTOMREQUEST, NULL);
> -	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL);
> -	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL);
> -	curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL);
> -	curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0);
> -	curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
>  	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1);
> -	curl_easy_setopt(slot->curl, CURLOPT_RANGE, NULL);
>  
>  	/*
>  	 * Default following to off unless "ALWAYS" is configured; this gives

Oh well, the answer is "no", or at least not as easily as this, as the
failing tests tell us. I guess it resets more data than we actually want
it to reset, but I didn't dig any deeper than that.

Patrick

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux