From: Jens Axboe <axboe@xxxxxxxxx> Sent: Monday, January 16, 2023 1:06 PM
>
> If we're doing a large IO request which needs to be split into multiple
> bios for issue, then we can run into the same situation as the below
> marked commit fixes - parts will complete just fine, one or more parts
> will fail to allocate a request. This will result in a partially
> completed read or write request, where the caller gets EAGAIN even though
> parts of the IO completed just fine.
>
> Do the same for large bios as we do for splits - fail a NOWAIT request
> with EAGAIN. This isn't technically fixing an issue in the below marked
> patch, but for stable purposes, we should have either none of them or
> both.
>
> This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")
>
> Cc: stable@xxxxxxxxxxxxxxx # 5.15+
> Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
> Link: https://github.com/axboe/liburing/issues/766
> Reported-and-tested-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>
> ---
>
> Since v1: catch this at submit time instead, since we can have various
> valid cases where the number of single page segments will not take a
> bio segment (page merging, huge pages).
>
> diff --git a/block/fops.c b/block/fops.c
> index 50d245e8c913..d2e6be4e3d1c 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -221,6 +221,24 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>  			bio_endio(bio);
>  			break;
>  		}
> +		if (iocb->ki_flags & IOCB_NOWAIT) {
> +			/*
> +			 * This is nonblocking IO, and we need to allocate
> +			 * another bio if we have data left to map. As we
> +			 * cannot guarantee that one of the sub bios will not
> +			 * fail getting issued FOR NOWAIT and as error results
> +			 * are coalesced across all of them, be safe and ask for
> +			 * a retry of this from blocking context.
> +			 */
> +			if (unlikely(iov_iter_count(iter))) {
> +				bio_release_pages(bio, false);
> +				bio_clear_flag(bio, BIO_REFFED);
> +				bio_put(bio);
> +				blk_finish_plug(&plug);
> +				return -EAGAIN;
> +			}
> +			bio->bi_opf |= REQ_NOWAIT;
> +		}
>
>  		if (is_read) {
>  			if (dio->flags & DIO_SHOULD_DIRTY)
> @@ -228,9 +246,6 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>  		} else {
>  			task_io_account_write(bio->bi_iter.bi_size);
>  		}
> -		if (iocb->ki_flags & IOCB_NOWAIT)
> -			bio->bi_opf |= REQ_NOWAIT;
> -
>  		dio->size += bio->bi_iter.bi_size;
>  		pos += bio->bi_iter.bi_size;
>

I've wrapped up my testing on this patch. All testing was via io_uring --
I did not test other paths. Testing was against a combination of this
patch and the previous patch set for a similar problem. [1]

I tested with a simple test program to issue single I/Os, and verified
that the expected paths were taken through the block layer and io_uring
code for various size I/Os, including over 1 Mbyte. No EAGAIN errors were
seen. This testing was with a 6.1 kernel.

Also tested the original app that surfaced the problem. It's a larger
scale workload using io_uring, and is where the problem was originally
encountered. That workload runs on a purpose-built 5.15 kernel, so I
backported both patches to 5.15 for this testing. All looks good. No
EAGAIN errors were seen.

Michael

[1] https://lore.kernel.org/linux-block/20230104160938.62636-1-axboe@xxxxxxxxx/
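The contract the patch enforces is also what user-space callers must honor: a
nonblocking request may fail wholesale with EAGAIN and should then be retried
from blocking context. A minimal user-space sketch of that pattern using
preadv2(2) with RWF_NOWAIT (the helper name is hypothetical, and this is not
the test program referenced above):

```c
/* Sketch: issue a nonblocking read, retry from blocking context on EAGAIN.
 * read_nowait_or_fallback() is a hypothetical helper name for illustration. */
#define _GNU_SOURCE
#include <errno.h>
#include <sys/uio.h>
#include <unistd.h>

static ssize_t read_nowait_or_fallback(int fd, void *buf, size_t len,
				       off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };

	/* First attempt: fail with EAGAIN rather than block */
	ssize_t ret = preadv2(fd, &iov, 1, off, RWF_NOWAIT);

	/* The kernel may refuse the whole request (EAGAIN), or the
	 * filesystem may not support NOWAIT at all (EOPNOTSUPP);
	 * either way, reissue the identical read without the flag. */
	if (ret < 0 && (errno == EAGAIN || errno == EOPNOTSUPP))
		ret = preadv2(fd, &iov, 1, off, 0);

	return ret;
}
```

With this patch applied, an EAGAIN from the first preadv2() means no part of
the request was issued, so the blocking retry reads the full range rather
than silently overlapping a partially completed I/O.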