From: Jens Axboe <axboe@xxxxxxxxx> Sent: Monday, January 16, 2023 1:06 PM
>
> If we're doing a large IO request which needs to be split into multiple
> bios for issue, then we can run into the same situation as the below
> marked commit fixes - parts will complete just fine, one or more parts
> will fail to allocate a request. This will result in a partially
> completed read or write request, where the caller gets EAGAIN even though
> parts of the IO completed just fine.
>
> Do the same for large bios as we do for splits - fail a NOWAIT request
> with EAGAIN. This isn't technically fixing an issue in the below marked
> patch, but for stable purposes, we should have either none of them or
> both.
>
> This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")
>
> Cc: stable@xxxxxxxxxxxxxxx # 5.15+
> Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
> Link: https://github.com/axboe/liburing/issues/766
> Reported-and-tested-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>
> ---
>
> Since v1: catch this at submit time instead, since we can have various
> valid cases where the number of single page segments will not take a
> bio segment (page merging, huge pages).
>
> diff --git a/block/fops.c b/block/fops.c
> index 50d245e8c913..d2e6be4e3d1c 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -221,6 +221,24 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>  			bio_endio(bio);
>  			break;
>  		}
> +		if (iocb->ki_flags & IOCB_NOWAIT) {
> +			/*
> +			 * This is nonblocking IO, and we need to allocate
> +			 * another bio if we have data left to map. As we
> +			 * cannot guarantee that one of the sub bios will not
> +			 * fail getting issued FOR NOWAIT and as error results
> +			 * are coalesced across all of them, be safe and ask for
> +			 * a retry of this from blocking context.
> +			 */
> +			if (unlikely(iov_iter_count(iter))) {
> +				bio_release_pages(bio, false);
> +				bio_clear_flag(bio, BIO_REFFED);
> +				bio_put(bio);
> +				blk_finish_plug(&plug);
> +				return -EAGAIN;
> +			}
> +			bio->bi_opf |= REQ_NOWAIT;
> +		}
>
>  		if (is_read) {
>  			if (dio->flags & DIO_SHOULD_DIRTY)
> @@ -228,9 +246,6 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>  		} else {
>  			task_io_account_write(bio->bi_iter.bi_size);
>  		}
> -		if (iocb->ki_flags & IOCB_NOWAIT)
> -			bio->bi_opf |= REQ_NOWAIT;
> -
>  		dio->size += bio->bi_iter.bi_size;
>  		pos += bio->bi_iter.bi_size;
>

I've wrapped up my testing on this patch. All testing was via io_uring --
I did not test other paths. Testing was against a combination of this
patch and the previous patch set for a similar problem. [1]

I tested with a simple test program to issue single I/Os, and verified
that the expected paths were taken through the block layer and io_uring
code for various size I/Os, including over 1 Mbyte. No EAGAIN errors were
seen. This testing was with a 6.1 kernel.

Also tested the original app that surfaced the problem. It's a larger
scale workload using io_uring, and is where the problem was originally
encountered. That workload runs on a purpose-built 5.15 kernel, so I
backported both patches to 5.15 for this testing. All looks good. No
EAGAIN errors were seen.

Michael

[1] https://lore.kernel.org/linux-block/20230104160938.62636-1-axboe@xxxxxxxxx/
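The contract the patch enforces is also what user-space callers must honor: a
nonblocking request may fail wholesale with EAGAIN and should then be retried
from blocking context. A minimal user-space sketch of that pattern using
preadv2(2) with RWF_NOWAIT (the helper name is hypothetical, and this is not
the test program referenced above):

```c
/* Sketch: issue a nonblocking read, retry from blocking context on EAGAIN.
 * read_nowait_or_fallback() is a hypothetical helper name for illustration. */
#define _GNU_SOURCE
#include <errno.h>
#include <sys/uio.h>
#include <unistd.h>

static ssize_t read_nowait_or_fallback(int fd, void *buf, size_t len,
				       off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };

	/* First attempt: fail with EAGAIN rather than block */
	ssize_t ret = preadv2(fd, &iov, 1, off, RWF_NOWAIT);

	/* The kernel may refuse the whole request (EAGAIN), or the
	 * filesystem may not support NOWAIT at all (EOPNOTSUPP);
	 * either way, reissue the identical read without the flag. */
	if (ret < 0 && (errno == EAGAIN || errno == EOPNOTSUPP))
		ret = preadv2(fd, &iov, 1, off, 0);

	return ret;
}
```

With this patch applied, an EAGAIN from the first preadv2() means no part of
the request was issued, so the blocking retry reads the full range rather
than silently overlapping a partially completed I/O.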