On 07/23, Linus Torvalds wrote:
>
> So here's a v2, now as a "real" commit with a commit message and everything.

I am already sleeping, will read it tomorrow, but at first glance...

> @@ -1013,18 +1014,40 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
>  	if (wait_page->bit_nr != key->bit_nr)
>  		return 0;
>
> +	/* Stop walking if it's locked */
> +	if (wait->flags & WQ_FLAG_EXCLUSIVE) {
> +		if (test_and_set_bit(key->bit_nr, &key->page->flags))
> +			return -1;
> +	} else {
> +		if (test_bit(key->bit_nr, &key->page->flags))
> +			return -1;
> +	}
> +
>  	/*
> -	 * Stop walking if it's locked.
> -	 * Is this safe if put_and_wait_on_page_locked() is in use?
> -	 * Yes: the waker must hold a reference to this page, and if PG_locked
> -	 * has now already been set by another task, that task must also hold
> -	 * a reference to the *same usage* of this page; so there is no need
> -	 * to walk on to wake even the put_and_wait_on_page_locked() callers.
> +	 * Let the waiter know we have done the page flag
> +	 * handling for it (and the return value lets the
> +	 * wakeup logic count exclusive wakeup events).
>  	 */
> -	if (test_bit(key->bit_nr, &key->page->flags))
> -		return -1;
> +	ret = (wait->flags & WQ_FLAG_EXCLUSIVE) != 0;
> +	wait->flags |= WQ_FLAG_WOKEN;
> +	wake_up_state(wait->private, mode);
>
> -	return autoremove_wake_function(wait, mode, sync, key);
> +	/*
> +	 * Ok, we have successfully done what we're waiting for,
> +	 * and we can unconditionally remove the wait entry.
> +	 *
> +	 * Note that this has to be the absolute last thing we do,
> +	 * since after list_del_init(&wait->entry) the wait entry
> +	 * might be de-allocated and the process might even have
> +	 * exited.
> +	 *
> +	 * We _really_ should have a "list_del_init_careful()" to
> +	 * properly pair with the unlocked "list_empty_careful()"
> +	 * in finish_wait().
> +	 */
> +	smp_mb();
> +	list_del_init(&wait->entry);

I think smp_wmb() would be enough, but this is minor.
We need a barrier between "wait->flags |= WQ_FLAG_WOKEN" and list_del_init().

But afaics we need another barrier, rmb(), in wait_on_page_bit_common() for
the case when wait->private was not blocked; we need to ensure that if
finish_wait() sees list_empty_careful() == T then we can't miss WQ_FLAG_WOKEN.

Oleg.
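
To make the pairing concrete, here is a minimal userspace sketch using C11 atomics. It is not kernel code and not the patch itself. The names wait_entry, on_list, FLAG_WOKEN, waiter_sees_woken() and run_once() are hypothetical stand-ins: the release store plays the role of the write barrier before list_del_init(), and the acquire load plays the role of the rmb() suggested after list_empty_careful(). The spin loop exists only so the demo terminates deterministically; the real waiter does not spin.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define FLAG_WOKEN 0x1u		/* hypothetical stand-in for WQ_FLAG_WOKEN */

struct wait_entry {
	unsigned int flags;	/* plain field, like wait->flags */
	atomic_bool on_list;	/* stand-in for !list_empty(&wait->entry) */
};

/* Waker side: mirrors "flags |= WOKEN; barrier; list_del_init()". */
static void *waker(void *arg)
{
	struct wait_entry *w = arg;

	w->flags |= FLAG_WOKEN;
	/* Release: the flags store is ordered before the "removal" store. */
	atomic_store_explicit(&w->on_list, false, memory_order_release);
	return NULL;
}

/* Waiter side: once the entry looks removed, WOKEN must be visible. */
static bool waiter_sees_woken(struct wait_entry *w)
{
	/* Acquire: pairs with the release above, like the proposed rmb(). */
	while (atomic_load_explicit(&w->on_list, memory_order_acquire))
		;	/* demo-only spin */
	return (w->flags & FLAG_WOKEN) != 0;
}

/* Run one waker/waiter round; returns true if the waiter saw WOKEN. */
static bool run_once(void)
{
	struct wait_entry w = { .flags = 0 };
	pthread_t t;
	int err;
	bool seen;

	atomic_init(&w.on_list, true);
	err = pthread_create(&t, NULL, waker, &w);
	assert(err == 0);
	seen = waiter_sees_woken(&w);
	pthread_join(t, NULL);
	return seen;
}
```

With both halves of the pairing in place, run_once() should always return true. Dropping the acquire side (for example, using memory_order_relaxed in the load) removes that guarantee on weakly ordered CPUs, which is the case the rmb() is meant to cover.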