Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 15, 2021 at 04:28:21PM +0100, Will Deacon wrote:
> On Thu, Apr 15, 2021 at 05:03:58PM +0200, Peter Zijlstra wrote:
> > @@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
> >  
> >  	/* When no more readers or writers, set the locked flag */
> >  	do {
> > -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> > -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> > -					_QW_LOCKED) != _QW_WAITING);
> > +		cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> 
> I think the issue is that >here< a concurrent reader in interrupt context
> can take the lock and release it again, but we could speculate reads from
> the critical section up over the later release and up before the control
> dependency here...

OK, so that was totally not clear from the original changelog.

> > +	} while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED));
> 
> ... and then this cmpxchg() will succeed, so our speculated stale reads
> could be used.
> 
> *HOWEVER*
> 
> Speculating a read should be fine in the face of a concurrent _reader_,
> so for this to be an issue it implies that the reader is also doing some
> (atomic?) updates.

So we're having something like:

	CPU0				CPU1

	queue_write_lock_slowpath()
	  atomic_cond_read_acquire() == _QW_WAITING

					queued_read_lock_slowpath()
					  atomic_cond_read_acquire()
					  return;
   ,--> (before X's store)
   |					X = y;
   |
   |					queued_read_unlock()
   |					  (void)atomic_sub_return_release()
   |
   |	  atomic_cmpxchg_relaxed(.old = _QW_WAITING)
   |
    `--	r = X;


Which as Will said is a cmpxchg ABA, however for it to be ABA, we have
to observe that unlock-store, and won't we then also have to observe the
whole critical section?

Ah, the issue is yet another load inside our own (CPU0)'s critical
section. which isn't ordered against the cmpxchg_relaxed() and can be
issued before.

So yes, then making the cmpxchg an acquire, will force all our own loads
to happen later.



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux