On Tue, May 14 2019, Jeff King wrote:

> On Tue, May 14, 2019 at 12:33:11PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > I think it would work because any update-server-info, whether from A
>> > or B, will take into account the full current repo state (and we
>> > don't look at that state until we take the lock). So you might get
>> > an interleaved "A-push, B-push, B-maint, A-maint", but that's OK.
>> > A-maint will represent B's state when it runs.
>>
>> Maybe we're talking about different things. I mean the following
>> sequence:
>>
>>  1. Refs "X" and "Y" are at X=A Y=A
>>  2. Concurrent push #1 happens, updating X from A..F
>>  3. Concurrent push #2 happens, updating Y from A..F
>>  4. Concurrent push #2 succeeds
>>  5. Concurrent push #2 starts update-server-info. Reads X=A Y=F
>>  6. Concurrent push #1 succeeds
>>  7. Concurrent push #1 starts update-server-info. Reads X=F Y=F
>>  8. Concurrent push #1's update-server-info finishes, X=F Y=F written to "info"
>>  9. Concurrent push #2's update-server-info finishes, X=A Y=F written to "info"
>>
>> I.e. because we have per-ref locks and no lock at all on
>> update-server-info (but that would need to be a global ref lock, not
>> just a lock on the "info" files), we can have a push that has already
>> read "X"'s value as "A" while updating "Y" win the race against an
>> update-server-info that updated "X"'s value to "F".
>>
>> It will get fixed on the next push (at least as far as "X"'s value
>> goes), but until then dumb clients will falsely see that "X" hasn't
>> been updated.
>
> That's the same situation. But I thought we were talking about having
> an update-server-info lock. In which case #1's or #2's
> update-server-info runs in its entirety, and the two cannot have their
> read and write steps interleaved (that's what I meant by "don't look at
> the state until we take the lock"). That gives us a strict ordering: we
> know that _some_ update-server-info (be it #1's or #2's) will run after
> any given update.

Yeah, you're right. I *thought* in my last E-mail we were talking about
the current state of affairs, but re-reading upthread I see that was a
mistake on my part. An update-server-info lock would indeed solve this.

We could still end up with a situation where a naïve use of the lockfile
API would make the "new" update-server-info fail outright because the
old one was still underway, so we'd need something similar to
core.*Ref*Timeout. But if we ran into a *.lock file or hit the timeout
we could at least exit non-zero, as opposed to silently losing the race
like we do now.
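
To make that a bit more concrete, here's a rough and entirely untested
sketch of what serializing whole update-server-info runs behind one lock
with a timeout might look like with our lockfile API. The lock path, the
function name and the timeout_ms parameter are all made up for
illustration; the timeout would presumably come from whatever
core.*Ref*Timeout-like config we'd add:

#include "cache.h"
#include "lockfile.h"

/*
 * Sketch only: serialize update-server-info runs behind a single lock
 * file so that a slow run cannot overwrite the output of a newer one.
 * The "info/server-info" lock path and the timeout_ms knob are
 * hypothetical.
 */
static int update_server_info_serialized(int force, long timeout_ms)
{
	struct lock_file lk = LOCK_INIT;
	int ret;

	if (hold_lock_file_for_update_timeout(&lk,
					      git_path("info/server-info"),
					      LOCK_REPORT_ON_ERROR,
					      timeout_ms) < 0)
		return -1; /* hit a *.lock or the timeout: fail loudly */

	/* reads refs/packs and rewrites the "info" files */
	ret = update_server_info(force);

	/*
	 * The lock is only used for mutual exclusion; the output files
	 * are written by update_server_info() itself, so drop the lock
	 * rather than committing it.
	 */
	rollback_lock_file(&lk);
	return ret;
}

With something like that, whichever run takes the lock last is also the
one that reads the newest ref state, which is the strict ordering you
describe, and a run that can't get the lock within the timeout would at
least fail loudly instead of silently writing stale data.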