On Fri, Aug 18, 2017 at 1:05 PM, Andi Kleen <ak@xxxxxxxxxxxxxxx> wrote:
>
> I think what's happening is that it allows more parallelism during wakeup:
>
> Normally it's like
>
> CPU 1                         CPU 2                 CPU 3                 .....
>
> LOCK
> wake up tasks on other CPUs   woken up              woken up
> UNLOCK                        SPIN on waitq lock    SPIN on waitq lock

Hmm. The processes that are woken up shouldn't need to touch the waitq
lock after wakeup.

The default "autoremove_wake_function()" does the wait list removal, so
if you just use the normal wait/wakeup, you're all done and don't need
to do anything more.

That's very much by design. In fact, it's why "finish_wait()" uses that
"list_empty_careful()" thing on the entry - exactly so that it only
needs to take the wait queue lock if it is still on the wait list (i.e.
it was woken up by something else).

Now, it *is* racy, in the sense that the autoremove_wake_function() will
remove the entry *after* having successfully woken up the process, so
with bad luck and a quick wakeup, the woken process may not see the good
list_empty_careful() case.

So we really *should* do the remove earlier, inside the pi_lock region
in ttwu(). We don't have that kind of interface, though. If you actually
do see tasks getting stuck on the waitqueue lock after being woken up,
it might be worth looking at.

The other possibility is that you were looking at cases that didn't use
"autoremove_wake_function()" at all, of course. Maybe they are worth
fixing. The autoremove really does make a difference, exactly because of
the issue you point to.

                Linus
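
To make the finish_wait() point concrete, here is a minimal single-threaded
userspace sketch. It is not kernel code: every model_* name, the list
helpers, and the "lock" (just a counter of acquisitions) are hypothetical
stand-ins. It only assumes what the mail describes: an autoremove-style
waker unlinks each entry under the queue lock, and the woken task takes the
lock in its finish step only if its entry is still on the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, in the style of the kernel's list_head. */
struct list_node { struct list_node *next, *prev; };

static void list_init(struct list_node *n) { n->next = n->prev = n; }
static bool list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_tail(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink and re-initialise, so a later emptiness check on the node succeeds. */
static void list_del_init(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Hypothetical model types: the "lock" is only a count of acquisitions. */
struct model_waitq { struct list_node head; int lock_taken; };
struct model_waiter { struct list_node entry; bool awake; };

static void model_prepare_to_wait(struct model_waitq *q, struct model_waiter *w)
{
    q->lock_taken++;                      /* queueing needs the lock */
    w->awake = false;
    list_add_tail(&w->entry, &q->head);
}

/* Waker: one lock acquisition covers the whole walk over the list. */
static int model_wake_all(struct model_waitq *q, bool autoremove)
{
    struct list_node *pos = q->head.next;
    int woken = 0;

    q->lock_taken++;
    while (pos != &q->head) {
        struct list_node *next = pos->next;   /* saved: pos may be unlinked */
        struct model_waiter *w = (struct model_waiter *)
            ((char *)pos - offsetof(struct model_waiter, entry));

        w->awake = true;
        if (autoremove)
            list_del_init(pos);           /* removal done on the waker side */
        woken++;
        pos = next;
    }
    return woken;
}

/* Woken task: only takes the lock if it is still on the wait list. */
static void model_finish_wait(struct model_waitq *q, struct model_waiter *w)
{
    if (!list_is_empty(&w->entry)) {
        q->lock_taken++;
        list_del_init(&w->entry);
    }
}

#define MODEL_MAX_WAITERS 16

/* Returns how many lock acquisitions the woken tasks made, or -1 on bad input. */
static int model_locks_taken_after_wakeup(int nwaiters, bool autoremove)
{
    struct model_waitq q;
    struct model_waiter w[MODEL_MAX_WAITERS];
    int before, i;

    if (nwaiters < 0 || nwaiters > MODEL_MAX_WAITERS)
        return -1;

    list_init(&q.head);
    q.lock_taken = 0;
    for (i = 0; i < nwaiters; i++)
        model_prepare_to_wait(&q, &w[i]);

    model_wake_all(&q, autoremove);

    before = q.lock_taken;
    for (i = 0; i < nwaiters; i++)
        model_finish_wait(&q, &w[i]);
    assert(list_is_empty(&q.head));       /* every entry is gone either way */
    return q.lock_taken - before;
}
```

In this model, model_locks_taken_after_wakeup(3, true) reports no lock
acquisitions by the woken tasks, while the non-autoremove variant reports one
per waiter. Being single-threaded, the sketch cannot show the race mentioned
above, where a quickly scheduled task runs its finish step before the waker
has unlinked the entry.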