On 1/17/23 08:39, Jens Axboe wrote:
> On 1/16/23 4:31 PM, Damien Le Moal wrote:
>> On 1/17/23 08:28, Jens Axboe wrote:
>>> On 1/16/23 4:20 PM, Damien Le Moal wrote:
>>>> On 1/17/23 06:06, Jens Axboe wrote:
>>>>> If we're doing a large IO request which needs to be split into multiple
>>>>> bios for issue, then we can run into the same situation as the below
>>>>> marked commit fixes - parts will complete just fine, one or more parts
>>>>> will fail to allocate a request. This will result in a partially
>>>>> completed read or write request, where the caller gets EAGAIN even though
>>>>> parts of the IO completed just fine.
>>>>>
>>>>> Do the same for large bios as we do for splits - fail a NOWAIT request
>>>>> with EAGAIN. This isn't technically fixing an issue in the below marked
>>>>> patch, but for stable purposes, we should have either none of them or
>>>>> both.
>>>>>
>>>>> This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")
>>>>>
>>>>> Cc: stable@xxxxxxxxxxxxxxx # 5.15+
>>>>> Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
>>>>> Link: https://github.com/axboe/liburing/issues/766
>>>>> Reported-and-tested-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
>>>>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>>>>
>>>>> ---
>>>>>
>>>>> Since v1: catch this at submit time instead, since we can have various
>>>>> valid cases where the number of single page segments will not take a
>>>>> bio segment (page merging, huge pages).
>>>>>
>>>>> diff --git a/block/fops.c b/block/fops.c
>>>>> index 50d245e8c913..d2e6be4e3d1c 100644
>>>>> --- a/block/fops.c
>>>>> +++ b/block/fops.c
>>>>> @@ -221,6 +221,24 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>>>>>  			bio_endio(bio);
>>>>>  			break;
>>>>>  		}
>>>>> +		if (iocb->ki_flags & IOCB_NOWAIT) {
>>>>> +			/*
>>>>> +			 * This is nonblocking IO, and we need to allocate
>>>>> +			 * another bio if we have data left to map. As we
>>>>> +			 * cannot guarantee that one of the sub bios will not
>>>>> +			 * fail getting issued FOR NOWAIT and as error results
>>>>> +			 * are coalesced across all of them, be safe and ask for
>>>>> +			 * a retry of this from blocking context.
>>>>> +			 */
>>>>> +			if (unlikely(iov_iter_count(iter))) {
>>>>> +				bio_release_pages(bio, false);
>>>>> +				bio_clear_flag(bio, BIO_REFFED);
>>>>> +				bio_put(bio);
>>>>> +				blk_finish_plug(&plug);
>>>>> +				return -EAGAIN;
>>>>
>>>> Doesn't this mean that for a really very large IO request that has 100%
>>>> chance of being split, the user will always get -EAGAIN ? Not that I mind,
>>>> doing super large IOs with NOWAIT is not a smart thing to do in the first
>>>> place... But as a user interface, it seems that this will prevent any
>>>> forward progress for such really large NOWAIT IOs. Is that OK ?
>>>
>>> Right, if you asked for NOWAIT, then it would not necessarily succeed if
>>> it:
>>>
>>> 1) Needs multiple bios
>>> 2) Needs splitting
>>>
>>> You're expected to attempt blocking issue at that point. Reasoning is
>>> explained in this (and the previous commit related to the issue),
>>> otherwise you end up with potentially various amounts of the request
>>> being written to disk or read from disk, but EAGAIN being returned for
>>> the request as a whole.
>>
>> Yes, I understood all that and completely agree with it.
>>
>> I was only wondering if this change may not surprise some (bad) userspace
>> stuff. Do we need to update some man page or other doc, mentioning that
>> there are no guarantees that a NOWAIT IO may actually be executed if it
>> too large (e.g. larger than max_sectors_kb) ?
>
> We can certainly add it to the man pages talking about RWF_NOWAIT. But
> there's never been a guarantee that any EAGAIN will later succeed
> under the same conditions, and honestly there are various conditions
> where this is already not true. And those same cases would spuriously
> yield EAGAIN before already, it's not a new condition for those sizes
> of IOs.

OK. Thanks.

-- 
Damien Le Moal
Western Digital Research
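
For illustration only, here is a minimal userspace sketch of the "attempt blocking issue" fallback discussed above: try the read with RWF_NOWAIT, and if it comes back with EAGAIN, reissue the same read without the flag. The helper name read_nowait_then_block() and its fell_back out-parameter are hypothetical and not part of the patch. Treating EOPNOTSUPP (file does not support RWF_NOWAIT) the same way as EAGAIN is an assumption of this sketch, not something the thread prescribes.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* exposes preadv2() and RWF_NOWAIT in glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): read up to len bytes at
 * offset off, first without blocking, then from blocking context if
 * the nonblocking attempt could not be serviced.
 *
 * Returns the byte count from preadv2() (which may be short), or -1
 * with errno set. If fell_back is non-NULL it is set to 1 when the
 * blocking retry was needed and to 0 otherwise.
 */
ssize_t read_nowait_then_block(int fd, void *buf, size_t len, off_t off,
			       int *fell_back)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t ret;

	assert(buf != NULL);
	if (fell_back)
		*fell_back = 0;

	/* First attempt: ask the kernel not to block. */
	ret = preadv2(fd, &iov, 1, off, RWF_NOWAIT);
	if (ret >= 0)
		return ret;

	/*
	 * EAGAIN: the request could not be done without blocking.
	 * EOPNOTSUPP: assumed here to mean RWF_NOWAIT is unsupported
	 * for this file. Any other error is returned to the caller.
	 */
	if (errno != EAGAIN && errno != EOPNOTSUPP)
		return -1;

	/* Second attempt: same request, blocking allowed. */
	if (fell_back)
		*fell_back = 1;
	return preadv2(fd, &iov, 1, off, 0);
}
```

With a block device opened O_DIRECT, a caller would pass a suitably aligned buffer and offset. Per the discussion above, a request large enough to need multiple bios or a split would be expected to take the blocking path every time, so callers should not spin on EAGAIN for such sizes.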