Re: [PATCH] Retry HTTP requests on SSL connect failures

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 01, 2012 at 02:23:06PM -0700, Shawn O. Pearce wrote:

> From: "Shawn O. Pearce" <spearce@xxxxxxxxxxx>
> 
> When libcurl fails to connect to an SSL server always retry the
> request once. Since the connection failed before the HTTP headers
> can be sent, no data has exchanged hands, so the remote side has
> not learned of the request and will not perform it twice.
> 
> In the wild we have seen git-remote-https fail to connect to
> some load-balanced SSL servers sporadically, while modern popular
> browsers (e.g. Firefox and Chromium) have no trouble with the same
> server pool.
> 
> Lets assume the site operators (Hi Google!) have a clue and are
> doing everything they already can to ensure secure, successful
> SSL connections from a wide range of HTTP clients. Implementing a
> single level of retry in the client can make it more robust against
> transient failure modes.

I find this a little distasteful just because we haven't figured out the
actual _reason_ for the failure. That is, I'm not convinced this isn't
something that curl or the ssl library can't handle internally if we
would only configure them correctly. Did you ever follow up on tweaking
the session caching options for curl?

Have you tried running your fails-after-a-few-hours request with other
clients that don't have the problem and seeing what they do (I'm
thinking a small webkit harness or something would be the most
feasible)?

That being said, you did make it so that it only kicks in during ssl
connect errors:

> +	for (attempts = 0; attempts < 2; attempts++) {
> +		if (start_active_slot(slot)) {
> +			run_active_slot(slot);
> +			if (slot->results->curl_result == CURLE_SSL_CONNECT_ERROR)
> +				continue;
> +			ret = handle_curl_result(slot);
> +		} else {
> +			error("Unable to start HTTP request for %s", url);
> +			ret = HTTP_START_FAILED;
> +		}
> +		break;

which means it shouldn't really be affecting the general populace. So
even though it feels like a dirty hack, at least it is self-contained,
and it does fix a real-world problem. If your answer to the above
questions is "hunting this further is just not worth the effort", I can
live with that.

> diff --git a/remote-curl.c b/remote-curl.c
> index a269608..04a379c 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -353,6 +353,8 @@ static int run_slot(struct active_request_slot *slot)
>  
>  	slot->results = &results;
>  	slot->curl_result = curl_easy_perform(slot->curl);
> +	if (slot->curl_result == CURLE_SSL_CONNECT_ERROR)
> +		slot->curl_result = curl_easy_perform(slot->curl);
>  	finish_active_slot(slot);

How come the first hunk gets a nice for-loop and this one doesn't?

Also, are these hunks the only two spots where this error can come up?
The first one does http_request, which handles smart-http GET requests.
the second does run_slot, which handles smart-http POST requests.

Some of the dumb http fetches will go through http_request. But some
will not. And I think almost none of dumb http push will.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]