Re: [PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

Guo Ren <guoren@xxxxxxxxxx> · Wed, 31 Mar 2021 23:22:35 +0800

On Mon, Mar 29, 2021 at 8:50 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Mon, Mar 29, 2021 at 08:01:41PM +0800, Guo Ren wrote:
> > u32 a = 0x55aa66bb;
> > u16 *ptr = &a;
> >
> > CPU0                       CPU1
> > =========             =========
> > xchg16(ptr, new)     while(1)
> >                                     WRITE_ONCE(*(ptr + 1), x);
> >
> > When we use lr.w/sc.w implement xchg16, it'll cause CPU0 deadlock.
>
> Then I think your LL/SC is broken.
No, it's not broken LR.W/SC.W. Quote <8.3 Eventual Success of
Store-Conditional Instructions>:

"As a consequence of the eventuality guarantee, if some harts in an
execution environment are
executing constrained LR/SC loops, and no other harts or devices in
the execution environment
execute an unconditional store or AMO to that reservation set, then at
least one hart will
eventually exit its constrained LR/SC loop. By contrast, if other
harts or devices continue to
write to that reservation set, it is not guaranteed that any hart will
exit its LR/SC loop."

So I think it's a feature of LR/SC. How does the above code (also use
ll.w/sc.w to implement xchg16) running on arm64?

1: ldxr
    eor
    cbnz ... 2f
    stxr
    cbnz ... 1b   // I think it would deadlock for arm64.

"LL/SC fwd progress" which you have mentioned could guarantee stxr
success? How hardware could do that?

>
> That also means you really don't want to build super complex locking
> primitives on top, because that live-lock will percolate through.
>
> Step 1 would be to get your architecute fixed such that it can provide
> fwd progress guarantees for LL/SC. Otherwise there's absolutely no point
> in building complex systems with it.
--
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/