On Thu, 22 Aug 2024, Christian Brauner wrote: > > > > So this barrier isn't about the bit. This barrier is about the > > wait_queue. In particular it is about waitqueue_active() call at the > > start of wake_up_var(). If that test wasn't there and if instead > > wake_up_var() conditionally called __wake_up(), then there would be no > > Did you mean "unconditionally called"? Yes :-) > > > need for any barrier. A comment near wake_up_bit() makes it clear that > > the barrier is only needed when the spinlock is avoided. > > > > On a weakly ordered arch, this test can be effected *before* the write > > of the bit. If the waiter adds itself to the wait queue and then tests > > the bit before the bit is set, but after the waitqueue_active() test is > > put in effect, then the wake_up will never be sent. > > > > But .... this is all academic of this code because you don't need a > > barrier at all. The wake_up happens in a spin_locked region, and the > > wait is entirely inside the same spin_lock, except for the schedule. A > > later patch has: > > spin_unlock(); schedule(); spin_lock(); > > > > So there is no room for a race. If the bit is cleared before the > > wait_var_event() equivalent, then no wakeup is needed. When the lock is > > dropped after the bit is cleared the unlock will have all the barriers > > needed for the bit to be visible. > > > > The only other place that the bit can be cleared is concurrent with the > > above schedule() while the spinlock isn't held by the waiter. In that > > case it is again clear that no barrier is needed - or that the > > spin_unlock/lock provide all the needed barriers. > > > > So a better comment would be > > > > /* no barrier needs as both waker and waiter are in spin-locked regions */ > > Thanks for the analysis. I was under the impression that wait_on_inode() > was called in contexts where no barrier is guaranteed and the bit isn't > checked with spin_lock() held. > Yes it is. I was looking at the I_SYNC waiter and mistakenly thought it was a model for the others. A wake_up for I_SYNC being cleared doesn't need the barrier. Maybe inode_wake_up_bit() should not have a barrier and the various callers should add whichever barrier is appropriate. That is the model that Linus prefers for wake_up_bit() and for consistency it should apply to inode_wake_up_bit() too. Thanks, NeilBrown