Re: [PATCH v4 27/28] reftable: fixup for new base topic 2/3

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Mon, 30 Aug 2021 16:03:42 +0200

On Mon, Aug 30 2021, Han-Wen Nienhuys wrote:

> On Mon, Aug 30, 2021 at 3:22 PM Ævar Arnfjörð Bjarmason
> <avarab@xxxxxxxxx> wrote:
>> But we will need at least the optimistic locking of code like
>> builtin/reflog.c wanting to do an expiry, and deciding whether to do
>> that expiry based on a given state of the ref/reflog. I.e. we don't
>> want:
>>
>>     1. Start reflog expiry
>>     2. Code in builtin/reflog.c looks up the OID
>>     3. Code in builtin/reflog.c decides whether expire the reflog
>>     4. Concurrent with #4, another writer updates the ref/reflog pair
>>     5. Code in builtin/reflog.c says "OK, expire it!"
>>     6. Reftable queues a delete/prune of the reflog per #5.
>>
>> This would be a sequente of updates to the ref/reflog, none of whom were
>> racy as far as the reftable semantics itself are concerneb, but where
>> we'd do the wrong thing because the writer thought we had A when we
>> really had B. So we need the equivalent of an "git update-ref" with the
>> "<oldvalue>".
>>
>> Is there a better way to do that in this case that I'm missing?
>
> I spent some more time looking at builtin/reflog.c, but I am still not
> 100% sure what the locking is used for.
>
> From a quick glance, the OID goes into tip_commit, and tip_commit goes
> into a reachable list (?). The reachable list is then for something,
> but I can't really tell what.
>
> In your example with 1.-6., it's still not clear to me what the
> undesired behavior is precisely. If the reflog is pruned in #6, is the
> worry that the update in #4 is pruned immediately after being
> effected?

Yes, I think so. But I'm not sure. I skimmed the code quickly today, and
when I wrote the referenced series didn't focus much on the nitty-gritty
of the builtin/reflog.c behavior other than assuring myself that I was
doing the exact same thing as before as far as its logic was concerned.

I.e. it always locked at a given OID. Before my in-flight "reflog
expire: don't lock reflogs using previously seen OI" it might not lock
but get this error:

    error: cannot lock ref '<refname>': ref '<refname>' is at <OID-A> but expected <OID-B>

But at least it wouldn't do anything, but the current code does require
the passed-in OID. See the code that needs "unreachable_expire_kind" and
"tip_commit".

Perhaps that whole thing can also be refactored somehow. If I change the
"commit = lookup_commit(the_repository, oid);" in
"reflog_expiry_prepare()" to just "commit = NULL" all tests pass, but
that might just be missing test coverage in the face of same race...