On Tue, May 14 2019, Jeff King wrote:

> On Tue, May 14, 2019 at 12:33:11PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > I think it would work because any update-server-info, whether from A
>> > or B, will take into account the full current repo state (and we
>> > don't look at that state until we take the lock). So you might get
>> > an interleaved "A-push, B-push, B-maint, A-maint", but that's OK.
>> > A-maint will represent B's state when it runs.
>>
>> Maybe we're talking about different things. I mean the following
>> sequence:
>>
>>  1. Refs "X" and "Y" are at X=A Y=A
>>  2. Concurrent push #1 happens, updating X from A..F
>>  3. Concurrent push #2 happens, updating Y from A..F
>>  4. Concurrent push #2 succeeds
>>  5. Concurrent push #2 starts update-server-info. Reads X=A Y=F
>>  6. Concurrent push #1 succeeds
>>  7. Concurrent push #1 starts update-server-info. Reads X=F Y=F
>>  8. Concurrent push #1's update-server-info finishes, X=F Y=F written to "info"
>>  9. Concurrent push #2's update-server-info finishes, X=A Y=F written to "info"
>>
>> I.e. because we have per-ref locks and no lock at all on
>> update-server-info (but that would need to be a global ref lock, not
>> just a lock on the "info" files), we can have a push that has already
>> read "X"'s value as "A" while updating "Y" win the race against an
>> update-server-info that updated "X"'s value to "F".
>>
>> It will get fixed on the next push (at least as far as "X"'s value
>> goes), but until then dumb clients will falsely see that "X" hasn't
>> been updated.
>
> That's the same situation. But I thought we were talking about having
> an update-server-info lock. In which case #1's or #2's
> update-server-info runs in its entirety, and the two cannot have their
> read and write steps interleaved (that's what I meant by "don't look at
> the state until we take the lock"). That gives us a strict ordering: we
> know that _some_ update-server-info (be it #1's or #2's) will run after
> any given update.

Yeah, you're right. I *thought* in my last E-mail we were talking about
the current state of affairs, but re-reading upthread I see that was a
mistake on my part. An update-server-info lock would indeed solve this.

We could still end up with a situation where a naïve use of the lockfile
API would make the "new" update-server-info fail outright because the
old one was still underway, so we'd need something similar to
core.*Ref*Timeout. But if we ran into a *.lock file or hit the timeout
we could at least exit non-zero, as opposed to silently losing the race
like we do now.
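
To make that a bit more concrete, here's a rough and entirely untested
sketch of what serializing whole update-server-info runs behind one lock
with a timeout might look like with our lockfile API. The lock path, the
function name and the timeout_ms parameter are all made up for
illustration; the timeout would presumably come from whatever
core.*Ref*Timeout-like config we'd add:

#include "cache.h"
#include "lockfile.h"

/*
 * Sketch only: serialize update-server-info runs behind a single lock
 * file so that a slow run cannot overwrite the output of a newer one.
 * The "info/server-info" lock path and the timeout_ms knob are
 * hypothetical.
 */
static int update_server_info_serialized(int force, long timeout_ms)
{
	struct lock_file lk = LOCK_INIT;
	int ret;

	if (hold_lock_file_for_update_timeout(&lk,
					      git_path("info/server-info"),
					      LOCK_REPORT_ON_ERROR,
					      timeout_ms) < 0)
		return -1; /* hit a *.lock or the timeout: fail loudly */

	/* reads refs/packs and rewrites the "info" files */
	ret = update_server_info(force);

	/*
	 * The lock is only used for mutual exclusion; the output files
	 * are written by update_server_info() itself, so drop the lock
	 * rather than committing it.
	 */
	rollback_lock_file(&lk);
	return ret;
}

With something like that, whichever run takes the lock last is also the
one that reads the newest ref state, which is the strict ordering you
describe, and a run that can't get the lock within the timeout would at
least fail loudly instead of silently writing stale data.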