Re: [PATCH 2/6] rcu: Remove superfluous full memory barrier upon first EQS snapshot

Frederic Weisbecker <frederic@xxxxxxxxxx> · Fri, 17 May 2024 13:40:11 +0200

Le Fri, May 17, 2024 at 09:29:14AM +0200, Andrea Parri a écrit :
> I know my remark may seem a little biased,  ;-) but the semantics of
> smp_mb__after_unlock_lock() and smp_mb__after_spinlock() have been
> somehowr/formally documented in the LKMM.  This means, in particular,
> that one can write "litmus tests" with the barriers at stake and then
> "run"/check such tests against the _current model.
> 
> For example,  (based on inline comments in include/linux/spinlock.h)
> 
> $ cat after_spinlock.litmus
> C after_spinlock
> 
> { }
> 
> P0(int *x, spinlock_t *s)
> {
> 	spin_lock(s);
> 	WRITE_ONCE(*x, 1);
> 	spin_unlock(s);
> }
> 
> P1(int *x, int *y, spinlock_t *s)
> {
> 	int r0;
> 
> 	spin_lock(s);
> 	smp_mb__after_spinlock();
> 	r0 = READ_ONCE(*x);
> 	WRITE_ONCE(*y, 1);
> 	spin_unlock(s);
> }
> 
> P2(int *x, int *y)
> {
> 	int r1;
> 	int r2;
> 
> 	r1 = READ_ONCE(*y);
> 	smp_rmb();
> 	r2 = READ_ONCE(*x);
> }
> 
> exists (1:r0=1 /\ 2:r1=1 /\ 2:r2=0)
> 
> $ herd7 -conf linux-kernel.cfg after_spinlock.litmus
> Test after_spinlock Allowed
> States 7
> 1:r0=0; 2:r1=0; 2:r2=0;
> 1:r0=0; 2:r1=0; 2:r2=1;
> 1:r0=0; 2:r1=1; 2:r2=0;
> 1:r0=0; 2:r1=1; 2:r2=1;
> 1:r0=1; 2:r1=0; 2:r2=0;
> 1:r0=1; 2:r1=0; 2:r2=1;
> 1:r0=1; 2:r1=1; 2:r2=1;
> No
> Witnesses
> Positive: 0 Negative: 7
> Condition exists (1:r0=1 /\ 2:r1=1 /\ 2:r2=0)
> Observation after_spinlock Never 0 7
> Time after_spinlock 0.01
> Hash=b377bde8fe3565fcdd0eb2bdfaf3351e
> 
> Notice that, according to the current model at least, the state in
> the above "exists" clause remains forbidden _after removal of the
> smp_mb__after_spinlock() barrier.  In this sense, if you want, the
> inline comment (I contributed to) is misleading/incomplete.  :-/

Z6.0+pooncelock+poonceLock+pombonce.litmus shows an example of
how full ordering is subtely incomplete without smp_mb__after_spinlock().

But still, smp_mb__after_unlock_lock() is supposed to be weaker than
smp_mb__after_spinlock() and yet I'm failing to produce a litmus test
that is successfull with the latter and fails with the former.

For example, and assuming smp_mb__after_unlock_lock() is expected to be
chained across locking, here is a litmus test inspired by
Z6.0+pooncelock+poonceLock+pombonce.litmus that never observes the condition
even though I would expect it should, as opposed to using
smp_mb__after_spinlock():

C smp_mb__after_unlock_lock

{}

P0(int *w, int *x, spinlock_t *mylock)
{
	spin_lock(mylock);
	WRITE_ONCE(*w, 1);
	WRITE_ONCE(*x, 1);
	spin_unlock(mylock);
}

P1(int *x, int *y, spinlock_t *mylock)
{
	int r0;

	spin_lock(mylock);
	smp_mb__after_unlock_lock();
	r0 = READ_ONCE(*x);
	WRITE_ONCE(*y, 1);
	spin_unlock(mylock);
}

P2(int *y, int *z, spinlock_t *mylock)
{
	int r0;

	spin_lock(mylock);
	r0 = READ_ONCE(*y);
	WRITE_ONCE(*z, 1);
	spin_unlock(mylock);
}

P3(int *w, int *z)
{
	int r1;

	WRITE_ONCE(*z, 2);
	smp_mb();
	r1 = READ_ONCE(*w);
}

exists (1:r0=1 /\ 2:r0=1 /\ z=2 /\ 3:r1=0)