On Wed, Jul 24, 2019 at 01:53:17PM -0300, Jason Gunthorpe wrote:
> On Wed, Jul 24, 2019 at 04:05:17AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Jul 24, 2019 at 10:17:14AM +0800, Jason Wang wrote:
> > > So even PTE is read speculatively before reading invalidate_count (only in
> > > the case of invalidate_count is zero). The spinlock has guaranteed that we
> > > won't read any stale PTEs.
> >
> > I'm sorry I just do not get the argument.
> > If you want to order two reads you need an smp_rmb
> > or stronger between them executed on the same CPU.
>
> No, that is only for unlocked algorithms.
>
> In this case the spinlock provides all the 'or stronger' ordering
> required.
>
> For invalidate_count going 0->1 the spin_lock ensures that any
> following PTE update during invalidation does not order before the
> spin_lock()
>
> While holding the lock and observing 1 in invalidate_count the PTE
> values might be changing, but are ignored. C's rules about sequencing
> make this safe.
>
> For invalidate_count going 1->0 the spin_unlock ensures that any
> preceding PTE update during invalidation does not order after the
> spin_unlock
>
> While holding the lock and observing 0 in invalidate_count the PTE
> values cannot be changing.
>
> Jason

Oh right. So prefetch holds the spinlock the whole time.
Sorry about the noise.

-- 
MST
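
For anyone reading the archive later, here is a rough userspace sketch of the locking pattern discussed above. It is not the code under review: a pthread mutex stands in for the kernel spinlock, and struct map_state, map_state_init(), invalidate_start(), update_pte(), invalidate_end() and try_prefetch() are all hypothetical names made up for illustration. To stay free of data races in plain C, the sketch reads the PTE only after seeing a zero count, rather than reading it and then ignoring it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical state; names are illustrative, not taken from the patch. */
struct map_state {
	pthread_mutex_t lock;   /* userspace stand-in for the kernel spinlock */
	int invalidate_count;   /* nonzero while an invalidation is in flight */
	unsigned long pte;      /* stand-in for the page table entry */
};

void map_state_init(struct map_state *s, unsigned long pte)
{
	pthread_mutex_init(&s->lock, NULL);
	s->invalidate_count = 0;
	s->pte = pte;
}

/* 0->1: PTE updates that follow cannot be ordered before this lock. */
void invalidate_start(struct map_state *s)
{
	pthread_mutex_lock(&s->lock);
	s->invalidate_count++;
	pthread_mutex_unlock(&s->lock);
}

/* Must only be called between invalidate_start() and invalidate_end(). */
void update_pte(struct map_state *s, unsigned long pte)
{
	s->pte = pte;
}

/* 1->0: PTE updates that came before cannot be ordered after this unlock. */
void invalidate_end(struct map_state *s)
{
	pthread_mutex_lock(&s->lock);
	assert(s->invalidate_count > 0);
	s->invalidate_count--;
	pthread_mutex_unlock(&s->lock);
}

/*
 * Holds the lock across both the count check and the PTE read.
 * Returns true and stores the PTE in *out only when no invalidation
 * is in flight; otherwise returns false and leaves *out untouched.
 */
bool try_prefetch(struct map_state *s, unsigned long *out)
{
	bool ok = false;

	pthread_mutex_lock(&s->lock);
	if (s->invalidate_count == 0) {
		*out = s->pte;
		ok = true;
	}
	pthread_mutex_unlock(&s->lock);
	return ok;
}
```

Because try_prefetch() never drops the lock between checking the count and reading the PTE, no explicit read barrier appears between the two loads; the lock acquire and release are assumed to supply the ordering, which is the point made in the thread.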