On Wed, Jul 24, 2019 at 04:05:17AM -0400, Michael S. Tsirkin wrote:
> On Wed, Jul 24, 2019 at 10:17:14AM +0800, Jason Wang wrote:
> > So even PTE is read speculatively before reading invalidate_count (only in
> > the case of invalidate_count is zero). The spinlock has guaranteed that we
> > won't read any stale PTEs.
>
> I'm sorry I just do not get the argument.
> If you want to order two reads you need an smp_rmb
> or stronger between them executed on the same CPU.

No, that is only for unlocked algorithms.

In this case the spinlock provides all the 'or stronger' ordering
required.

For invalidate_count going 0->1 the spin_lock() ensures that any
following PTE update during invalidation does not order before the
spin_lock().

While holding the lock and observing 1 in invalidate_count the PTE
values might be changing, but are ignored. C's rules about sequencing
make this safe.

For invalidate_count going 1->0 the spin_unlock() ensures that any
preceding PTE update during invalidation does not order after the
spin_unlock().

While holding the lock and observing 0 in invalidate_count the PTE
values cannot be changing.

Jason
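
To make the argument above concrete, here is a minimal userspace sketch.
It is not the vhost code: struct mapping, mapping_init(),
invalidate_start(), invalidate_end() and read_pte_if_stable() are all
hypothetical names, a pthread mutex stands in for the kernel spinlock,
and a plain int stands in for the PTE. The only point is that the
counter is changed and tested under the lock, so no smp_rmb() is needed
between the two reads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the real structures; illustration only. */
struct mapping {
	pthread_mutex_t lock;	/* stands in for the kernel spinlock */
	int invalidate_count;	/* non-zero while an invalidation is in flight */
	int pte;		/* stands in for the PTE value */
};

void mapping_init(struct mapping *m, int pte)
{
	pthread_mutex_init(&m->lock, NULL);
	m->invalidate_count = 0;
	m->pte = pte;
}

/* 0->1: PTE updates that follow cannot be ordered before this lock section. */
void invalidate_start(struct mapping *m)
{
	pthread_mutex_lock(&m->lock);
	m->invalidate_count++;
	pthread_mutex_unlock(&m->lock);
}

/* 1->0: PTE updates that came before cannot be ordered after the unlock. */
void invalidate_end(struct mapping *m)
{
	pthread_mutex_lock(&m->lock);
	m->invalidate_count--;
	pthread_mutex_unlock(&m->lock);
}

/*
 * Reader: the PTE is only read, and only reported as valid, when
 * invalidate_count is observed as 0 while holding the lock.
 */
bool read_pte_if_stable(struct mapping *m, int *out)
{
	bool stable;

	pthread_mutex_lock(&m->lock);
	stable = (m->invalidate_count == 0);
	if (stable)
		*out = m->pte;
	pthread_mutex_unlock(&m->lock);
	return stable;
}
```

In this sketch a caller that gets false back from read_pte_if_stable()
has to retry or take a slow path; the PTE writer only touches m->pte
between invalidate_start() and invalidate_end().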