On 01/04, Linus Torvalds wrote: > > On Thu, 2 Jan 2025 at 08:33, Oleg Nesterov <oleg@xxxxxxxxxx> wrote: > > > > I was going to send a one-liner patch which adds mb() into pipe_poll() > > but then I decided to make even more spam and ask some questions first. > > poll functions are not *supposed* to need memory barriers. ... > But no, this is most definitely not a pipe-only thing. Agreed, that is why I didn't send the patch which adds mb() into pipe_poll(). > They are supposed to do "poll_wait()" and then not need any more > serialization after that, because we either > > (a) have a NULL wait-address, in which case we're not going to sleep > and this is just a "check state" To be honest, I don't understand the wait_address check in poll_wait(), it seems that wait_address is never NULL. But ->_qproc can be NULL if do_poll/etc does another iteration after poll_schedule_timeout(), in this case we can sleep again. But this case is fine. > (b) the waiting function is supposed to do add_wait_queue() (usually > by way of __pollwait) and that should be a sufficient barrier to > anybody who does a wakeup Yes. > And this makes me think that the whole comment above > waitqueue_active() is just fundamentally wrong. The smp_mb() is *not* > sufficient in the sequence > > smp_mb(); > if (waitqueue_active(wq_head)) > wake_up(wq_head); > > because while it happens to work wrt prepare_to_wait() sequences, is > is *not* against other users of add_wait_queue(). Well, this comment doesn't look wrong to me, but perhaps it can be more clear. It should probably explain that this pseudo code is only correct because the waiter has a barrier before "if (@cond)" which pairs with smp_mb() above waitqueue_active(). It even says prepare_to_wait(&wq_head, &wait, state); // smp_mb() from set_current_state() but perhaps this is not 100% clear. > But I think this poll() thing is very much an example of this *not* > being valid, and I don't think it's in any way pipe-specific. Yes. > So maybe we really do need to add the memory barrier to > __add_wait_queue(). That's going to be painful, particularly with lots > of users not needing it because they have the barrier in all the other > places. Yes, that is why I don't really like the idea to add mb() into __add_wait_queue(). > End result: maybe adding it just to __pollwait() is the thing to do, > in the hopes that non-poll users all use the proper sequences. That is what I tried to propose. Will you agree with this change? We can even use smp_store_mb(), say @@ -224,11 +224,12 @@ static void __pollwait(struct file *filp, wait_queue_head_t *wait_address, if (!entry) return; entry->filp = get_file(filp); - entry->wait_address = wait_address; entry->key = p->_key; init_waitqueue_func_entry(&entry->wait, pollwake); entry->wait.private = pwq; add_wait_queue(wait_address, &entry->wait); + // COMMENT + smp_store_mb(entry->wait_address, wait_address); Oleg.