Re: [PATCH v3 2/2] block,iomap: disable iopoll when split needed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 16, 2020 at 05:18:51PM +0800, Jeffle Xu wrote:
> Both blkdev fs and iomap-based fs (ext4, xfs, etc.) currently support
> sync iopoll. One single bio can contain at most BIO_MAX_PAGES, i.e. 256
> bio_vec. If the input iov_iter contains more than 256 segments, then
> one dio will be split into multiple bios, which may cause potential
> deadlock for sync iopoll.
> 
> When it comes to sync iopoll, the bio is submitted without REQ_NOWAIT
> flag set and the process may hang in blk_mq_get_tag() if the dio needs
> to be split into multiple bios and thus can rapidly exhausts the queue
> depth. The process has to wait for the completion of the previously
> allocated requests, which should be reaped by the following sync
> polling, and thus causing a deadlock.
> 
> In fact there's a subtle difference of handling of HIPRI IO between
> blkdev fs and iomap-based fs, when dio need to be split into multiple
> bios. blkdev fs will set REQ_HIPRI for only the last split bio, leaving
> the previous bios queued into normal hardware queues, and not causing
> the trouble described above. iomap-based fs will set REQ_HIPRI for all
> split bios, and thus may cause the potential deadlock decribed above.
> 
> Thus disable iopoll when one dio need to be split into multiple bios.
> Though blkdev fs may not suffer this issue, still it may not make much
> sense to iopoll for big IO, since iopoll is initially for small size,
> latency sensitive IO.
> 
> Signed-off-by: Jeffle Xu <jefflexu@xxxxxxxxxxxxxxxxx>
> ---
>  fs/block_dev.c       | 7 +++++++
>  fs/iomap/direct-io.c | 8 ++++++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 9e84b1928b94..1b56b39e35b5 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -436,6 +436,13 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
>  			break;
>  		}
>  
> +		/*
> +		 * The current dio need to be split into multiple bios here.
> +		 * iopoll is initially for small size, latency sensitive IO,
> +		 * and thus disable iopoll if split needed.
> +		 */
> +		iocb->ki_flags &= ~IOCB_HIPRI;
> +

Not sure if it is good to clear IOCB_HIPRI of iocb, since it is usually
maintained by upper layer code(io_uring, aio, ...) and we shouldn't
touch this flag here.

>  		if (!dio->multi_bio) {
>  			/*
>  			 * AIO needs an extra reference to ensure the dio
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index c1aafb2ab990..46668cceefd2 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -308,6 +308,14 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>  		copied += n;
>  
>  		nr_pages = iov_iter_npages(dio->submit.iter, BIO_MAX_PAGES);
> +		/*
> +		 * The current dio need to be split into multiple bios here.
> +		 * iopoll is initially for small size, latency sensitive IO,
> +		 * and thus disable iopoll if split needed.
> +		 */
> +		if (nr_pages)
> +			dio->iocb->ki_flags &= ~IOCB_HIPRI;

Same concern as above.

Thanks,
Ming




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux