On 07/22, Linus Torvalds wrote: > > Comments? Oleg, this should fix the race you talked about too. Yes. I still can't convince myself thatI fully understand this patch but I see nothing really wrong after a quick glance... > + * We can no longer use 'wait' after we've done the > + * list_del_init(&wait->entry), Yes, but see below, > + * the target may decide it's all done with no > + * other locking, and 'wait' has been allocated on > + * the stack of the target. > */ > - if (test_bit(key->bit_nr, &key->page->flags)) > - return -1; > + target = wait->private; > + smp_mb(); > > - return autoremove_wake_function(wait, mode, sync, key); > + /* > + * Ok, we have successfully done what we're waiting for. > + * > + * Now unconditionally remove the wait entry, so that the > + * waiter can use that to see success or not. > + * > + * We _really_ should have a "list_del_init_careful()" > + * to properly pair with an unlocked "list_empty_careful()". > + */ > + list_del_init(&wait->entry); > + > + /* > + * Theres's another memory barrier in the wakup path, that > + * makes sure the wakup happens after the above is visible > + * to the target. > + */ > + wake_up_state(target, mode); We can no longer use 'target'. If it was already woken up it can notice list_empty_careful(), return without taking q->lock, and exit. Of course, this is purely theoretical... rcu_read_lock() should help but perhaps we can avoid it somehow? Say, can't we abuse WQ_FLAG_WOKEN? wake_page_function: wait->flags |= WQ_FLAG_WOKEN; wmb(); autoremove_wake_function(...); wait_on_page_bit_common: for (;;) { set_current_state(); if (wait.flags & WQ_FLAG_WOKEN) break; schedule(); } finish_wait(); rmb(); return wait.flags & WQ_FLAG_WOKEN ? 0 : -EINTR; Another (cosmetic) problem is that wake_up_state(mode) looks confusing. It is correct but only because we know that mode == TASK_NORMAL and thus wake_up_state() can'fail if the target is still blocked. > + spin_lock_irq(&q->lock); > + SetPageWaiters(page); > + if (!trylock_page_bit_common(page, bit_nr, behavior)) > + __add_wait_queue_entry_tail(q, wait); do we need SetPageWaiters() if trylock() succeeds ? Oleg.