On Thu, Jun 29, 2017 at 09:03:42PM -0700, Linus Torvalds wrote: > On Thu, Jun 29, 2017 at 12:15 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote: > > On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote: > >> > >> swait uses special locking and has odd semantics that are not at all > >> the same as the default wait queue ones. It should not be used without > >> very strong reasons (and honestly, the only strong enough reason seems > >> to be "RT"). > > > > Performance shortcut: > > > > https://lkml.org/lkml/2016/2/25/301 > > Now, admittedly I don't know the code and really may be entirely off, > but looking at the commit (no need to go to the lkml archives - it's > commit 8577370fb0cb ("KVM: Use simple waitqueue for vcpu->wq") in > mainline), I really think the swait() use is simply not correct if > there can be multiple waiters, exactly because swake_up() only wakes > up a single entry. > > So either there is only a single entry, or *all* the code like > > dvcpu->arch.wait = 0; > > - if (waitqueue_active(&dvcpu->wq)) > - wake_up_interruptible(&dvcpu->wq); > + if (swait_active(&dvcpu->wq)) > + swake_up(&dvcpu->wq); > > is simply wrong. If there are multiple blockers, and you just cleared > "arch.wait", I think they should *all* be woken up. And that's not > what swake_up() does. Code like this is probably wrong for another reason too. The swait_active() is likely redudant, since swake_up() also calls swait_active(). The check in swake_up() returns if it thinks there are no active waiters. However, the synchronization needed to ensure a proper wakeup is left as an exercise to swake_up's caller. There have been a couple of other discussions around this topic recently: https://lkml.org/lkml/2017/5/25/722 https://lkml.org/lkml/2017/6/8/1222 The above is better written as the following, but even then you still have the single/multiple wakeup problem: - if (waitqueue_active(&dvcpu->wq)) - wake_up_interruptible(&dvcpu->wq); + smp_mb(); + swake_up(&dvcpu->wq); Just to add to the confusion, the last time I checked, the semantics of swake_up() even differ between RT Linux and mainline, which makes this even more confusing. -K