Re: [PATCH 4/6] block: avoid ordered task state change for polled IO

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jens

On 11/10/18 11:13 PM, Jens Axboe wrote:
> We only really need the barrier if we're going to be sleeping,
> if we're just polling we're fine with the __set_current_state()
> variant.
> 
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> ---
>  fs/block_dev.c | 25 +++++++++++++++++++++----
>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index c039abfb2052..e7550b1f9670 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -237,12 +237,23 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
>  
>  	qc = submit_bio(&bio);
>  	for (;;) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> +		/*
> +		 * Using non-atomic task state for polling
> +		 */
> +		__set_current_state(TASK_UNINTERRUPTIBLE);
> +
>  		if (!READ_ONCE(bio.bi_private))
>  			break;

When we don't use polling, the blkdev_bio_end_io_simple may come on a different cpu before
submit_bio returns 
For example,
__blkdev_direct_IO_simple
  qc = submit_bio(&bio) 
                                          blkdev_bio_end_io_simple
                                            WRITE_ONCE(bio.bi_private, NULL)
                                            wake_up_process(waiter)
  __set_current_state(TASK_UNINTERRUPTIBLE)                              
  READ_ONCE(bio.bi_private)

At the moment, a smp_rmb at least maybe needed before READ_ONCE(bio.bi_private)
to ensure the WRITE_ONCE(bio.bi_private, NULL) is seen if we miss the wakeup.

> +
>  		if (!(iocb->ki_flags & IOCB_HIPRI) ||
> -		    !blk_poll(bdev_get_queue(bdev), qc))
> +		    !blk_poll(bdev_get_queue(bdev), qc)) {
> +			/*
> +			 * If we're scheduling, ensure that task state
> +			 * change is ordered and seen.
> +			 */
> +			smp_mb();
>  			io_schedule();
> +		}
>  	}
>  	__set_current_state(TASK_RUNNING);
>  
> @@ -403,13 +414,19 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
>  		return -EIOCBQUEUED;
>  
>  	for (;;) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> +		/*
> +		 * See __blkdev_direct_IO_simple() loop
> +		 */
> +		__set_current_state(TASK_UNINTERRUPTIBLE);
> +

Similar with above.

>  		if (!READ_ONCE(dio->waiter))
>  			break;
>  
>  		if (!(iocb->ki_flags & IOCB_HIPRI) ||
> -		    !blk_poll(bdev_get_queue(bdev), qc))
> +		    !blk_poll(bdev_get_queue(bdev), qc)) {
> +			smp_mb();
>  			io_schedule();
> +		}
>  	}
>  	__set_current_state(TASK_RUNNING);
>  
> 

Thanks
Jianchao



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux