On Thu, Jul 23, 2020 at 11:01 AM Oleg Nesterov <oleg@xxxxxxxxxx> wrote: > > > + * > > + * We _really_ should have a "list_del_init_careful()" to > > + * properly pair with the unlocked "list_empty_careful()" > > + * in finish_wait(). > > + */ > > + smp_mb(); > > + list_del_init(&wait->entry); > > I think smp_wmb() would be enough, but this is minor. Well, what we _really_ want (and what that comment is about) would be got that list_del_init_careful() to use a "smp_store_release()" for the last store, and then "list_empty_careful()" would use a "smp_load_acquire()" for the corresponding first load. On x86, that's free. On most other architectures, it's the minimal ordering requirement. And no, I don't think "smp_wmb()" is technically enough. With crazy memory ordering, one of the earlier *reads* (eg loading "wait->private" when waking things up) could have been delayed until after the stores that initialize the list - and thus read stack contents from another process after it's been released and re-used. Does that happen in reality? No. There are various conditionals in there which means that the stores end up being gated on the loads and cannot actually be re-ordered, but it's the kind of subtley So we actually do want to constrain all earlier reads and writes wrt the final write. Which is exactly what "smp_store_release()" does. But with our current list_empty_careful(), the smp_mb() is I think technically sufficient. > We need a barrier between "wait->flags |= WQ_FLAG_WOKEN" and list_del_init(), See above: we need more than just that write barrier, although in _practice_ you're right, and the other barriers actually all already exist and are part of wake_up_state(). So the smp_mb() is unnecessary, and in fact your smp_wmb() would be too. But I left it there basically as "documentation". > But afaics we need another barrier, rmb(), in wait_on_page_bit_common() fo > the case when wait->private was not blocked; we need to ensure that if > finish_wait() sees list_empty_careful() == T then we can't miss WQ_FLAG_WOKEN. Again, this is what a proper list_empty_careful() with a smp_load_acquire() would have automatically gotten for us. But yes, I think that without that, and with the explicit barriers, we need an smp_rmb() after the list_empty_careful(). I really think it should be _in_ list_empty_careful(), though. Or maybe finish_wait(). Hmm. Because looking at all the other finish_wait() uses, the fact that the waitqueue _list_ is properly ordered isn't really a guarantee of the rest of the stack space is. In practice, it will be, but I think this lack of serialization is a potential real bug elsewhere too. (Obviously none of this would show on x86, where we already *get* that smp_store_release/smp_load_acquire behavior for the existing list_del_init()/list_empty_careful(), since all stores are releases, and all loads are acquires) So I think that is a separate issue, generic to our finish_wait() uses. Linus