On 05/05/2015 09:21 PM, Jeff King wrote:
> On Sat, May 02, 2015 at 07:19:28AM +0200, Michael Haggerty wrote:
> 
>> 100 ms seems to be considered an acceptable delay between the time
>> that a user, say, clicks a button and the time that the button
>> reacts. What we are talking about is the time between the release of
>> a lock by one process and the resumption of another process that was
>> blocked waiting for the lock. The former is probably not under the
>> control of the user anyway, and perhaps not even observable by the
>> user. Thus I don't think that a perceivable delay between that event
>> and the resumption of the blocked process would be annoying. The more
>> salient delay is between the time that the user started the blocked
>> command and when that command completed. Let's look in more detail.
> 
> Yeah, you can't impact when the other process will drop the lock, but
> if we assume that it takes on the order of 100ms for the other process
> to do its whole operation, then on average we experience half that.
> And then tack on to that whatever time we waste in sleep() after the
> other guy drops the lock. And that's on average half of our backoff
> time.
> 
> So something like 100ms max backoff makes sense to me, in that it
> keeps us in the same order of magnitude as the expected time that the
> lock is held. [...]

I don't understand your argument. If another process blocks us for on
the order of 100 ms, the backoff time (reading from my table) is less
than half of that. It is only if another process blocks us for longer
that our backoff times grow larger than 100 ms. I don't see the point
of comparing those larger backoff numbers to hypothetical 100 ms
expected blocking times when the larger backoffs *can only happen* for
larger blocking times [1].

But even aside from bikeshedding about which backoff algorithm might be
a tiny bit better than another, let's remember that these locking
conflicts are INCREDIBLY RARE in real life. Current git doesn't have
any retry at all, but users don't seem to be noticeably upset.

In a moment I will submit a re-roll, changing the test case to add the
"wait" that Johannes suggested but leaving the maximum backoff time
unchanged. If anybody feels strongly about changing it, go ahead and do
so (or make it configurable). I like the current setting because I
think it makes more sense for servers, which is the only environment
where lock contention is likely to occur with any measurable frequency.

Michael

[1] For completeness, let's also consider a different scenario: Suppose
    the blocking is not being caused by a single long-lived process but
    rather by many short-lived processes running one after the other.
    In that case the time we spend blocked depends more on the duty
    cycle of the other blocking processes, so our backoff time could
    grow to be longer than the mean time that any single process holds
    the lock. But in this scenario we are throughput-limited rather
    than latency-limited, so acquiring the lock sooner would only
    deprive another process of it, without significantly improving the
    throughput of the system as a whole. (And given that the other
    processes are probably following the same rules as we are, the
    shorter backoff times are just as often helping them snatch the
    lock from us as us from them.)

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
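
P.S. For anyone following the thread without the series in front of
them, here is a minimal, self-contained sketch of the kind of
capped-backoff retry loop being discussed. It is illustrative only:
try_acquire_lock() is a hypothetical stand-in for the real lock
routine, and the simple doubling schedule and 100 ms cap are
assumptions chosen to mirror the numbers in this discussion, not the
exact schedule from the actual patches.

    /*
     * Illustrative sketch only -- not the patch under discussion.
     * Retry acquiring a lock, sleeping between attempts with a
     * backoff that grows up to a fixed cap.
     */
    #include <stdio.h>
    #include <unistd.h>

    /* Hypothetical stand-in; pretend the lock is always busy here. */
    static int try_acquire_lock(const char *path)
    {
            (void)path;
            return 0;
    }

    static int acquire_lock_with_backoff(const char *path, long timeout_ms)
    {
            long waited_ms = 0;
            long backoff_ms = 1;
            const long max_backoff_ms = 100;  /* the cap being debated */

            while (!try_acquire_lock(path)) {
                    if (waited_ms >= timeout_ms)
                            return -1;  /* give up after the timeout */
                    usleep(backoff_ms * 1000);
                    waited_ms += backoff_ms;
                    backoff_ms *= 2;  /* grow the backoff... */
                    if (backoff_ms > max_backoff_ms)
                            backoff_ms = max_backoff_ms;  /* ...but never past the cap */
            }
            return 0;
    }

    int main(void)
    {
            if (acquire_lock_with_backoff("some.lock", 1000) < 0)
                    fprintf(stderr, "fatal: could not acquire lock\n");
            return 0;
    }

The point of the cap is visible in the loop: however long another
process holds the lock, the time we oversleep after it is released is
bounded by max_backoff_ms.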