On Tue, Oct 13, 2020 at 11:45:02PM +0000, brian m. carlson wrote:

> Yeah, I was about to mention the same thing. It looks like we cover
> only a subset of requests. Moreover, I think this feature is going to
> practically fail in some cases and we need to either document that
> clearly or abandon this effort.
>
> In remote-curl.c, we have post_rpc, which does a POST request to upload
> data for a push. However, if the data is larger than the buffer, we
> stream it using chunked transfer-encoding. Because we're reading from a
> pipe, that data cannot be retried: the pack-objects stream will have
> ended.

Right, this is the large_request code path there. We use the same
function for fetches, too, though perhaps it's less likely that a
negotiation request will exceed the post buffer size.

We do send a probe rpc in this case, which lets us handle an HTTP 401
auth request. We _could_ retry on errors to the probe rpc, but I'm not
sure it would accomplish that much. The interesting thing is whether the
actual request with content goes through. If retrying magically fixes
things, there's no reason to think that the actual request is any less
likely to fail intermittently than the probe request (in fact, I'd
expect it to fail more).

It would be possible to restart even these large requests. Obviously we
could spool the contents to disk in order to replay them. But that
carries a cost for the common case in which we either succeed or
definitely fail on the first attempt, and never use the spooled copy.

Another option is to simply restart the Git process that is generating
the data we're piping. But that has to happen outside of post_rpc();
only the caller knows the right way to invoke it again. And we kind of
already have that: just run the Git command again. I know that sounds a
little dismissive, but it has really been our recommended retry strategy
for ages[1]. So I'd wonder, in Sean's use case, why just restarting the
whole Git process isn't a workable solution.
It wouldn't respect Retry-After, but it might be reasonable to surface
that header's value to the caller so it can act appropriately (and
likewise to indicate whether we saw an error that implies retrying might
work).

All of this is written from the perspective of protocol v1. In v2, we do
a lot more blind packet-shuffling (see remote-curl.c:stateless_connect()).
I suspect that makes any kind of retry at the level of the http code
much harder, whereas just restarting the Git command would probably work
just fine.

-Peff

[1] I think git-submodule will retry failed clones, for example. TBH, I
have never once seen this retry accomplish anything; it only wastes time
and makes the output more confusing (since we see the failure twice). I
have to admit I'm not thrilled to see more blind retrying for that
reason.