Re: [v2 2/2] block,iomap: disable iopoll when split needed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 15, 2020 at 03:40:31PM +0800, Jeffle Xu wrote:
> Both blkdev fs and iomap-based fs (ext4, xfs, etc.) currently support
> sync iopoll. One single bio can contain at most BIO_MAX_PAGES, i.e. 256
> bio_vec. If the input iov_iter contains more than 256 segments, then
> the IO request described by this iov_iter will be split into multiple
> bios, which may cause potential deadlock for sync iopoll.
> 
> When it comes to sync iopoll, the bio is submitted without REQ_NOWAIT
> flag set and the process may hang in blk_mq_get_tag() if the input
> iov_iter has to be split into multiple bios and thus rapidly exhausts
> the queue depth. The process has to wait for the completion of the
> previously allocated requests, which should be done by the following
> sync polling, and thus causing a deadlock.
> 
> Actually there's subtle difference between the behaviour of handling
> HIPRI IO of blkdev and iomap, when the input iov_iter need to split
> into multiple bios. blkdev will set REQ_HIPRI for only the last split
> bio, leaving the previous bio queued into normal hardware queues, which
> will not cause the trouble described above though. iomap will set
> REQ_HIPRI for all bios split from one iov_iter, and thus may cause the
> potential deadlock decribed above.
> 
> Disable iopoll when one request need to be split into multiple bios.
> Though blkdev may not suffer the problem, still it may not make much
> sense to iopoll for big IO, since iopoll is initially for small size,
> latency sensitive IO.
> 
> Signed-off-by: Jeffle Xu <jefflexu@xxxxxxxxxxxxxxxxx>
> ---
>  fs/block_dev.c       | 7 +++++++
>  fs/iomap/direct-io.c | 9 ++++++++-
>  2 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 9e84b1928b94..a8a52cab15ab 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -491,6 +491,13 @@ blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>  	if (is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES)
>  		return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
>  
> +	/*
> +	 * IOpoll is initially for small size, latency sensitive IO.
> +	 * Disable iopoll if split needed.
> +	 */
> +	if (nr_pages > BIO_MAX_PAGES)
> +		iocb->ki_flags &= ~IOCB_HIPRI;

more pages than BIO_MAX_PAGES don't imply a split because we can
physically merge pages into a single vector (yes, BIO_MAX_PAGES is
utterly misnamed now).



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux