On 1/17/23 08:28, Jens Axboe wrote:
> On 1/16/23 4:20 PM, Damien Le Moal wrote:
>> On 1/17/23 06:06, Jens Axboe wrote:
>>> If we're doing a large IO request which needs to be split into multiple
>>> bios for issue, then we can run into the same situation as the below
>>> marked commit fixes - parts will complete just fine, one or more parts
>>> will fail to allocate a request. This will result in a partially
>>> completed read or write request, where the caller gets EAGAIN even though
>>> parts of the IO completed just fine.
>>>
>>> Do the same for large bios as we do for splits - fail a NOWAIT request
>>> with EAGAIN. This isn't technically fixing an issue in the below marked
>>> patch, but for stable purposes, we should have either none of them or
>>> both.
>>>
>>> This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")
>>>
>>> Cc: stable@xxxxxxxxxxxxxxx # 5.15+
>>> Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
>>> Link: https://github.com/axboe/liburing/issues/766
>>> Reported-and-tested-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
>>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>>
>>> ---
>>>
>>> Since v1: catch this at submit time instead, since we can have various
>>> valid cases where the number of single page segments will not take a
>>> bio segment (page merging, huge pages).
>>>
>>> diff --git a/block/fops.c b/block/fops.c
>>> index 50d245e8c913..d2e6be4e3d1c 100644
>>> --- a/block/fops.c
>>> +++ b/block/fops.c
>>> @@ -221,6 +221,24 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>>>  			bio_endio(bio);
>>>  			break;
>>>  		}
>>> +		if (iocb->ki_flags & IOCB_NOWAIT) {
>>> +			/*
>>> +			 * This is nonblocking IO, and we need to allocate
>>> +			 * another bio if we have data left to map. As we
>>> +			 * cannot guarantee that one of the sub bios will not
>>> +			 * fail getting issued FOR NOWAIT and as error results
>>> +			 * are coalesced across all of them, be safe and ask for
>>> +			 * a retry of this from blocking context.
>>> +			 */
>>> +			if (unlikely(iov_iter_count(iter))) {
>>> +				bio_release_pages(bio, false);
>>> +				bio_clear_flag(bio, BIO_REFFED);
>>> +				bio_put(bio);
>>> +				blk_finish_plug(&plug);
>>> +				return -EAGAIN;
>>
>> Doesn't this mean that for a really very large IO request that has 100%
>> chance of being split, the user will always get -EAGAIN ? Not that I mind,
>> doing super large IOs with NOWAIT is not a smart thing to do in the first
>> place... But as a user interface, it seems that this will prevent any
>> forward progress for such really large NOWAIT IOs. Is that OK ?
>
> Right, if you asked for NOWAIT, then it would not necessarily succeed if
> it:
>
> 1) Needs multiple bios
> 2) Needs splitting
>
> You're expected to attempt blocking issue at that point. Reasoning is
> explained in this (and the previous commit related to the issue),
> otherwise you end up with potentially various amounts of the request
> being written to disk or read from disk, but EAGAIN being returned for
> the request as a whole.

Yes, I understood all that and completely agree with it. I was only
wondering if this change may not surprise some (bad) userspace stuff. Do
we need to update some man page or other doc, mentioning that there are no
guarantees that a NOWAIT IO may actually be executed if it is too large
(e.g. larger than max_sectors_kb) ?

-- 
Damien Le Moal
Western Digital Research
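
For illustration only, the "attempt blocking issue at that point" pattern
discussed above might look roughly like the sketch below from the userspace
side. It uses preadv2(2) with RWF_NOWAIT rather than io_uring, and the helper
name read_nowait_then_block() is hypothetical and not part of the patch. It
also assumes the caller handles any O_DIRECT alignment requirements itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): try a non-blocking read first,
 * and reissue the same request from blocking context if the kernel asks
 * for a retry with EAGAIN. An offset of -1 means "use the current file
 * position", as documented for preadv2(2).
 */
static ssize_t read_nowait_then_block(int fd, void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t ret;

	assert(buf != NULL);

	/* First attempt: RWF_NOWAIT fails with EAGAIN instead of blocking. */
	ret = preadv2(fd, &iov, 1, off, RWF_NOWAIT);
	if (ret >= 0)
		return ret;	/* may be a short count, as with any read */

	/* EOPNOTSUPP: this file does not support RWF_NOWAIT at all. */
	if (errno != EAGAIN && errno != EOPNOTSUPP)
		return -1;

	/* Fallback: same request with no flags, so it is allowed to block. */
	return preadv2(fd, &iov, 1, off, 0);
}
```

A caller that only ever retries with NOWAIT set would, per the discussion
above, never make forward progress on a request that is large enough to
always need multiple bios or a split, which is why the fallback drops the
flag instead of looping.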