On Thu, 2 Jan 2025 at 08:33, Oleg Nesterov <oleg@xxxxxxxxxx> wrote: > > I was going to send a one-liner patch which adds mb() into pipe_poll() > but then I decided to make even more spam and ask some questions first. poll functions are not *supposed* to need memory barriers. They are supposed to do "poll_wait()" and then not need any more serialization after that, because we either (a) have a NULL wait-address, in which case we're not going to sleep and this is just a "check state" (b) the waiting function is supposed to do add_wait_queue() (usually by way of __pollwait) and that should be a sufficient barrier to anybody who does a wakeup Note that add_wait_queue() ends up doing a spinlock sequence, and while that is not a full memory barrier (well, it is on x86, but not necessarily in general), it *should* be sufficient against an actual waker. That's kind of how add_wait_queue() vs wake_up() is supposed to work. Of course, the fact that we're not discussing the pipe code *not* doing a full wake sequence (but just a "is the wait queue empty" thing) is what then messes with the generic rules. And this makes me think that the whole comment above waitqueue_active() is just fundamentally wrong. The smp_mb() is *not* sufficient in the sequence smp_mb(); if (waitqueue_active(wq_head)) wake_up(wq_head); because while it happens to work wrt prepare_to_wait() sequences, is is *not* against other users of add_wait_queue(). In those other sequences the smp_mb() in set_current_state might have happened long long before. Those other users aren't just 'poll()', btw. There's quite a lot of add_wait_queue() users in the kernel. It's a traditional model even if it's not something people generally add to any more. Now, hopefully many of those add_wait_queue() users end up using set_current_state() and getting the memory barrier that way. Or they use wait_woken() or any of the other proper helpers we have. But I think this poll() thing is very much an example of this *not* being valid, and I don't think it's in any way pipe-specific. So maybe we really do need to add the memory barrier to __add_wait_queue(). That's going to be painful, particularly with lots of users not needing it because they have the barrier in all the other places. End result: maybe adding it just to __pollwait() is the thing to do, in the hopes that non-poll users all use the proper sequences. But no, this is most definitely not a pipe-only thing. Linus