On 12/28, Oleg Nesterov wrote:
>
> >  int __wake_up(struct wait_queue_head *wq_head, unsigned int mode,
> > 			int nr_exclusive, void *key)
> >  {
> > +	if (list_empty(&wq_head->head)) {
> > +		struct list_head *pn;
> > +
> > +		/*
> > +		 * pairs with spin_unlock_irqrestore(&wq_head->lock);
> > +		 * We actually do not need to acquire wq_head->lock, we just
> > +		 * need to be sure that there is no prepare_to_wait() that
> > +		 * completed on any CPU before __wake_up was called.
> > +		 * Thus instead of load_acquiring the spinlock and dropping
> > +		 * it again, we load_acquire the next list entry and check
> > +		 * that the list is not empty.
> > +		 */
> > +		pn = smp_load_acquire(&wq_head->head.next);
> > +
> > +		if(pn == &wq_head->head)
> > +			return 0;
> > +	}
>
> Too subtle for me ;)
>
> I have some concerns, but I need to think a bit more to (try to) actually
> understand this change.

If nothing else, consider

	int CONDITION;
	wait_queue_head_t WQ;

	void wake(void)
	{
		CONDITION = 1;
		wake_up(WQ);
	}

	void wait(void)
	{
		DEFINE_WAIT_FUNC(entry, woken_wake_function);

		add_wait_queue(WQ, entry);
		if (!CONDITION)
			wait_woken(entry, ...);
		remove_wait_queue(WQ, entry);
	}

This code is correct even if LOAD(CONDITION) can leak into the critical
section in add_wait_queue(), so the CPU running wait() can actually do

	// add_wait_queue
	spin_lock(WQ->lock);
	LOAD(CONDITION); // false!
	list_add(entry, head);
	spin_unlock(WQ->lock);

	if (!false) // result of the LOAD above
		wait_woken(entry, ...);

Now suppose that another CPU executes wake() between LOAD(CONDITION)
and list_add(entry, head). With your patch wait() will miss the event.

The same holds for __pollwait(), I think... No?

Oleg.
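
A minimal userspace sketch of the lost-wakeup interleaving described above,
with pthreads and usleep() standing in for the scheduler. All names here
(condition, queued, woken, waiter, waker) are invented for the demo, not
kernel code; the sleeps only force the window in which wake() runs between
the leaked LOAD(CONDITION) and the list_add(). C11 atomics keep the accesses
well-defined, so the only "bug" left is the interleaving itself:

	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdio.h>
	#include <unistd.h>

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	static atomic_int condition;	/* CONDITION */
	static atomic_int queued;	/* models !list_empty(&wq_head->head) */
	static atomic_int woken;	/* did the waker see the entry? */

	static void *waiter(void *arg)
	{
		int cond_seen;

		pthread_mutex_lock(&lock);
		/* the LOAD(CONDITION) that leaked above list_add() */
		cond_seen = atomic_load(&condition);	/* reads 0 */
		usleep(100 * 1000);	/* the waker runs entirely in here */
		atomic_store(&queued, 1);	/* list_add(entry, head) */
		pthread_mutex_unlock(&lock);

		if (!cond_seen && !atomic_load(&woken))
			printf("lost wakeup: waiter would sleep forever\n");
		return NULL;
	}

	static void *waker(void *arg)
	{
		usleep(50 * 1000);	/* land inside the waiter's window */
		atomic_store(&condition, 1);	/* CONDITION = 1 */
		/* the patched __wake_up(): lockless check, no spin_lock() */
		if (!atomic_load(&queued))
			return NULL;	/* list looks empty: return 0 */
		atomic_store(&woken, 1);	/* would wake the waiter */
		return NULL;
	}

	int main(void)
	{
		pthread_t t0, t1;

		pthread_create(&t0, NULL, waiter, NULL);
		pthread_create(&t1, NULL, waker, NULL);
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);
		return 0;
	}

With the unpatched __wake_up() the waker would instead block on the lock
until after list_add(), find the entry on the list, and the wakeup would not
be lost; the lockless check is what lets it run inside the window unnoticed.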