Hi Jens On 11/13/18 12:26 AM, Jens Axboe wrote: > On 11/12/18 2:35 AM, jianchao.wang wrote: >> Hi Jens >> >> On 11/10/18 11:13 PM, Jens Axboe wrote: >>> We only really need the barrier if we're going to be sleeping, >>> if we're just polling we're fine with the __set_current_state() >>> variant. >>> >>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> >>> --- >>> fs/block_dev.c | 25 +++++++++++++++++++++---- >>> 1 file changed, 21 insertions(+), 4 deletions(-) >>> >>> diff --git a/fs/block_dev.c b/fs/block_dev.c >>> index c039abfb2052..e7550b1f9670 100644 >>> --- a/fs/block_dev.c >>> +++ b/fs/block_dev.c >>> @@ -237,12 +237,23 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter, >>> >>> qc = submit_bio(&bio); >>> for (;;) { >>> - set_current_state(TASK_UNINTERRUPTIBLE); >>> + /* >>> + * Using non-atomic task state for polling >>> + */ >>> + __set_current_state(TASK_UNINTERRUPTIBLE); >>> + >>> if (!READ_ONCE(bio.bi_private)) >>> break; >> >> When we don't use polling, the blkdev_bio_end_io_simple may come on a different cpu before >> submit_bio returns >> For example, >> __blkdev_direct_IO_simple >> qc = submit_bio(&bio) >> blkdev_bio_end_io_simple >> WRITE_ONCE(bio.bi_private, NULL) >> wake_up_process(waiter) >> __set_current_state(TASK_UNINTERRUPTIBLE) >> READ_ONCE(bio.bi_private) > > While I agree that you are right, that's an existing issue. The store > barrier implied by set_current_state() doesn't fix that. > > How about this variant: > > https://urldefense.proofpoint.com/v2/url?u=http-3A__git.kernel.dk_cgit_linux-2Dblock_commit_-3Fh-3Dmq-2Dperf-26id-3D036ba7a8d1334889c0fe55d4858d78f4e8022f12&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=pjbk_9rxuy7OS1pCkQ9MAHzFf4eT27Yt4RL8jASDJCA&s=AnmYMIcGpIILUrjDFsabj61pcTyMCRHB1Pu8XXi47t0&e= > > which then changes the later wake up helper patch to be: > > https://urldefense.proofpoint.com/v2/url?u=http-3A__git.kernel.dk_cgit_linux-2Dblock_commit_-3Fh-3Dmq-2Dperf-26id-3Df8c3f188425967adb040cfefb799b0a5a1df769d&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=pjbk_9rxuy7OS1pCkQ9MAHzFf4eT27Yt4RL8jASDJCA&s=PdajuUul55e6p0FED_AzzWt5Gip0ekLi1eGYBMLiZPo&e= > The wake up path has contained a full memory barrier with the raw_spin_lock and following smp_mb__after_spinlock wake_up_process -> try_to_wake_up raw_spin_lock_irqsave(&p->pi_lock, flags); smp_mb__after_spinlock(); So a smp_rmb() could be enough. :) Thanks Jianchao