On Thu, Dec 20, 2018 at 8:05 PM Jens Axboe <axboe@xxxxxxxxx> wrote: > > Jens Axboe (108): > block: avoid ordered task state change for polled IO This one seems *very* questionable. The commit message is misleading: For the core poll helper, the task state setting don't need to imply any atomics, as it's the current task itself that is being modified and we're not going to sleep. For IRQ driven, the wakeup path have the necessary barriers to not need us using the heavy handed version of the task state setting. What? Barriers on one side have *no* meaning or point if the *other* side doesn't have any barriers. That's completely magical thinking. But it's also not at all the case that such barriers even exist. The wakup side does: struct task_struct *waiter = dio->submit.waiter; WRITE_ONCE(dio->submit.waiter, NULL); blk_wake_io_task(waiter); and here the sleeping side now does: __set_current_state(TASK_UNINTERRUPTIBLE); if (!READ_ONCE(dio->submit.waiter)) break; which is entirely unordered. So just what protects against this: Sleeper: Waker (different cpu) read submit.waker (sees non-NULL) WRITE_ONCE(dio->submit.waiter, NULL); blk_wake_io_task(waiter); write TASK_UNINTERRUPTIBLE io_schedule() and now the sleeper sleeps forever, because there will be nobody who ever wakes it up (the wakeup happened before the write of TASK_UNINTERRUPTIBLE). Notice how it does not matter AT ALL what barriers that other CPU is doing when waking things up. The problem is the sleeping CPU having reordered the check for "oh, there's a waiter", and "tell people I'm going to sleep". Those memory barriers matter. You can't just remove them randomly and claim they don't. Why would you even remove it in the first place? Guys, if you don't understand memory ordering, then you have absolutely no business using __set_current_state(). And talking about barriers on the other side of the equation clearly shows that you don't seem to understand memory ordering. Barriers are only *ever* meaningful if they exist on both sides (although sometimes said barriers can be implicit: an address dependency or similar in RCU, now that we've decided to not care abotu the crazy alpha read-barrier-depends) Maybe I'm missing something, but this really looks like a completely invalid "optimization" to me. And it's entirely bogus too. If that memory barrier matters, you're almost certainly doing something wrong (most likely benchmarking something pointless). Linus