On 1/20/23 9:56 PM, Michael Kelley (LINUX) wrote:
> From: Jens Axboe <axboe@xxxxxxxxx> Sent: Monday, January 16, 2023 1:06 PM
>>
>> If we're doing a large IO request which needs to be split into multiple
>> bios for issue, then we can run into the same situation as the below
>> marked commit fixes - parts will complete just fine, one or more parts
>> will fail to allocate a request. This will result in a partially
>> completed read or write request, where the caller gets EAGAIN even though
>> parts of the IO completed just fine.
>>
>> Do the same for large bios as we do for splits - fail a NOWAIT request
>> with EAGAIN. This isn't technically fixing an issue in the below marked
>> patch, but for stable purposes, we should have either none of them or
>> both.
>>
>> This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")
>>
>> Cc: stable@xxxxxxxxxxxxxxx # 5.15+
>> Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
>> Link: https://github.com/axboe/liburing/issues/766
>> Reported-and-tested-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>
>> ---
>>
>> Since v1: catch this at submit time instead, since we can have various
>> valid cases where the number of single page segments will not take a
>> bio segment (page merging, huge pages).
>>
>> diff --git a/block/fops.c b/block/fops.c
>> index 50d245e8c913..d2e6be4e3d1c 100644
>> --- a/block/fops.c
>> +++ b/block/fops.c
>> @@ -221,6 +221,24 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>>  			bio_endio(bio);
>>  			break;
>>  		}
>> +		if (iocb->ki_flags & IOCB_NOWAIT) {
>> +			/*
>> +			 * This is nonblocking IO, and we need to allocate
>> +			 * another bio if we have data left to map. As we
>> +			 * cannot guarantee that one of the sub bios will not
>> +			 * fail getting issued FOR NOWAIT and as error results
>> +			 * are coalesced across all of them, be safe and ask for
>> +			 * a retry of this from blocking context.
>> +			 */
>> +			if (unlikely(iov_iter_count(iter))) {
>> +				bio_release_pages(bio, false);
>> +				bio_clear_flag(bio, BIO_REFFED);
>> +				bio_put(bio);
>> +				blk_finish_plug(&plug);
>> +				return -EAGAIN;
>> +			}
>> +			bio->bi_opf |= REQ_NOWAIT;
>> +		}
>>
>>  		if (is_read) {
>>  			if (dio->flags & DIO_SHOULD_DIRTY)
>> @@ -228,9 +246,6 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>>  		} else {
>>  			task_io_account_write(bio->bi_iter.bi_size);
>>  		}
>> -		if (iocb->ki_flags & IOCB_NOWAIT)
>> -			bio->bi_opf |= REQ_NOWAIT;
>> -
>>  		dio->size += bio->bi_iter.bi_size;
>>  		pos += bio->bi_iter.bi_size;
>>
>
> I've wrapped up my testing on this patch. All testing was via
> io_uring -- I did not test other paths. Testing was against a
> combination of this patch and the previous patch set for a similar
> problem. [1]
>
> I tested with a simple test program to issue single I/Os, and verified
> the expected paths were taken through the block layer and io_uring
> code for various size I/Os, including over 1 Mbyte. No EAGAIN errors
> were seen. This testing was with a 6.1 kernel.
>
> Also tested the original app that surfaced the problem. It's a larger
> scale workload using io_uring, and is where the problem was originally
> encountered. That workload runs on a purpose-built 5.15 kernel, so I
> backported both patches to 5.15 for this testing. All looks good. No
> EAGAIN errors were seen.

Thanks a lot for your thorough testing! Can you share the 5.15 backports,
so we can put them into 5.15-stable as well potentially?

-- 
Jens Axboe
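
For reference, a minimal sketch of the kind of single-I/O test described in the
report might look like the following. It issues one large O_DIRECT read through
io_uring, big enough that the kernel has to build more than one bio, and checks
that the completion is the full transfer rather than -EAGAIN or a short read.
The device path, queue depth, 2 MiB size and 4 KiB alignment are illustrative
assumptions, not details taken from the thread.

/*
 * Sketch: one large O_DIRECT read via io_uring.
 * Build (assuming liburing is installed): gcc -O2 -o big-read big-read.c -luring
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <liburing.h>

#define IO_SIZE	(2 * 1024 * 1024)	/* large enough to need multiple bios */

int main(int argc, char **argv)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	void *buf;
	int fd, ret, res;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <block device>\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_RDONLY | O_DIRECT);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* O_DIRECT wants an aligned buffer */
	if (posix_memalign(&buf, 4096, IO_SIZE))
		return 1;

	ret = io_uring_queue_init(8, &ring, 0);
	if (ret < 0) {
		fprintf(stderr, "queue_init: %d\n", ret);
		return 1;
	}

	/* single large read submitted in one go */
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_read(sqe, fd, buf, IO_SIZE, 0);
	io_uring_submit(&ring);

	ret = io_uring_wait_cqe(&ring, &cqe);
	if (ret < 0) {
		fprintf(stderr, "wait_cqe: %d\n", ret);
		return 1;
	}

	/*
	 * With the fix applied, the inline nonblocking attempt either covers
	 * the whole request or io_uring retries it from blocking context, so
	 * the completion should show the full size, not -EAGAIN.
	 */
	res = cqe->res;
	io_uring_cqe_seen(&ring, cqe);
	printf("read returned %d (expected %d)\n", res, IO_SIZE);

	io_uring_queue_exit(&ring);
	close(fd);
	free(buf);
	return res == IO_SIZE ? 0 : 1;
}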
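
The same "all or nothing" NOWAIT contract the patch preserves also matters for
callers outside io_uring: a nonblocking direct read either completes as a whole
or fails with EAGAIN as a whole, so the caller can safely retry the entire
request from blocking context. A hedged illustration of that retry pattern
using preadv2(2) with RWF_NOWAIT is below; the helper name is hypothetical, and
it assumes a kernel and glibc new enough to provide RWF_NOWAIT. It is meant to
be dropped into a test program that has already opened a device with O_DIRECT.

#define _GNU_SOURCE
#include <errno.h>
#include <sys/uio.h>

/* Nonblocking read with a blocking retry; returns bytes read or -1. */
static ssize_t read_nowait_with_fallback(int fd, void *buf, size_t len,
					 off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t ret;

	/* first attempt must not block; may fail with EAGAIN */
	ret = preadv2(fd, &iov, 1, off, RWF_NOWAIT);
	if (ret < 0 && errno == EAGAIN)
		/* retry the whole request, this time allowing blocking */
		ret = preadv2(fd, &iov, 1, off, 0);
	return ret;
}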