On Mon, Sep 12, 2022 at 11:10:17AM +0000, Hernan Luis Ponce de Leon wrote: > I think it is somehow possible to show the liveness violation using herd7 and the following variant of the code > > ------------------------------------------------------------------------ > C Liveness > { > atomic_t x = ATOMIC_INIT(0); > atomic_t y = ATOMIC_INIT(0); > } > > > P0(atomic_t *x) { > // clear_pending_set_locked > int r0 = atomic_fetch_add(2,x) ; > } > > P1(atomic_t *x, int *z, int *y) { > // this store breaks liveness > WRITE_ONCE(*y, 1); > // queued_spin_trylock > int r0 = atomic_read(x); > // barrier after the initialisation of nodes > smp_wmb(); > // xchg_tail > int r1 = atomic_cmpxchg_relaxed(x,r0,42); > // link node into the waitqueue > WRITE_ONCE(*z, 1); > } > > P2(atomic_t *x,int *z, int *y) { > // node initialisation > WRITE_ONCE(*z, 2); > // queued_spin_trylock > int r0 = atomic_read(x); > // barrier after the initialisation of nodes > smp_wmb(); > // if we read z==2 we expect to read this store > WRITE_ONCE(*y, 0); > // xchg_tail > int r1 = atomic_cmpxchg_relaxed(x,r0,24); > // spinloop > int r2 = READ_ONCE(*y); > int r3 = READ_ONCE(*z); > } > > exists (z=2 /\ y=1 /\ 2:r2=1 /\ 2:r3=2) > ------------------------------------------------------------------------ > > Condition "2:r3=2" forces the spinloop to read from the first write in P2 and "z=2" forces this write > to be last in the coherence order. Conditions "2:r2=1" and "y=1" force the same for the other read. > herd7 says this behavior is allowed by LKMM, showing that liveness can be violated. > > In all the examples above, if we use mb() instead of wmb(), LKMM does not accept > the behavior and thus liveness is guaranteed. That's right. The issue may be somewhat confused by the fact that there have been _two_ separate problems covered in this discussion. One has to do with the use of relaxed vs. non-relaxed atomic accesses, and the other -- this one -- has to do with liveness (eventual termination of a spin loop). You can see the distinction quite clearly by noticing in the litmus test above, the variable x plays absolutely no role. There are no dependencies from it, it isn't accessed by any instructions that include an acquire or release memory barrier, and it isn't used in the "exists" clause. If we remove x from the test (and remove P0, which is now vacuous), and we also remove the unneeded reads at the end of P2 (unneeded because they observe the co-final values stored in y and z), what remains is: P1(int *z, int *y) { WRITE_ONCE(*y, 1); smp_wmb(); WRITE_ONCE(*z, 1); } P2(int *z, int *y) { WRITE_ONCE(*z, 2); smp_wmb(); WRITE_ONCE(*y, 0); } exists (z=2 /\ y=1) which is exactly the 2+2W pattern. As others have pointed out, this pattern is permitted by LKML because we never found a good reason to forbid it, even though it cannot occur on any real hardware that supports Linux. On the other hand, the simplicity of this "liveness" test leads me to wonder if it isn't missing some crucial feature of the actual qspinlock implementation. Alan