Re: [PATCH v2] block: don't allow multiple bios for IOCB_NOWAIT issue

Damien Le Moal <damien.lemoal@xxxxxxxxxxxxxxxxxx> · Tue, 17 Jan 2023 11:17:07 +0900

On 1/17/23 08:39, Jens Axboe wrote:
> On 1/16/23 4:31 PM, Damien Le Moal wrote:
>> On 1/17/23 08:28, Jens Axboe wrote:
>>> On 1/16/23 4:20 PM, Damien Le Moal wrote:
>>>> On 1/17/23 06:06, Jens Axboe wrote:
>>>>> If we're doing a large IO request which needs to be split into multiple
>>>>> bios for issue, then we can run into the same situation as the below
>>>>> marked commit fixes - parts will complete just fine, one or more parts
>>>>> will fail to allocate a request. This will result in a partially
>>>>> completed read or write request, where the caller gets EAGAIN even though
>>>>> parts of the IO completed just fine.
>>>>>
>>>>> Do the same for large bios as we do for splits - fail a NOWAIT request
>>>>> with EAGAIN. This isn't technically fixing an issue in the below marked
>>>>> patch, but for stable purposes, we should have either none of them or
>>>>> both.
>>>>>
>>>>> This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")
>>>>>
>>>>> Cc: stable@xxxxxxxxxxxxxxx # 5.15+
>>>>> Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
>>>>> Link: https://github.com/axboe/liburing/issues/766
>>>>> Reported-and-tested-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
>>>>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>>>>
>>>>> ---
>>>>>
>>>>> Since v1: catch this at submit time instead, since we can have various
>>>>> valid cases where the number of single page segments will not take a
>>>>> bio segment (page merging, huge pages).
>>>>>
>>>>> diff --git a/block/fops.c b/block/fops.c
>>>>> index 50d245e8c913..d2e6be4e3d1c 100644
>>>>> --- a/block/fops.c
>>>>> +++ b/block/fops.c
>>>>> @@ -221,6 +221,24 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
>>>>>  			bio_endio(bio);
>>>>>  			break;
>>>>>  		}
>>>>> +		if (iocb->ki_flags & IOCB_NOWAIT) {
>>>>> +			/*
>>>>> +			 * This is nonblocking IO, and we need to allocate
>>>>> +			 * another bio if we have data left to map. As we
>>>>> +			 * cannot guarantee that one of the sub bios will not
>>>>> +			 * fail getting issued FOR NOWAIT and as error results
>>>>> +			 * are coalesced across all of them, be safe and ask for
>>>>> +			 * a retry of this from blocking context.
>>>>> +			 */
>>>>> +			if (unlikely(iov_iter_count(iter))) {
>>>>> +				bio_release_pages(bio, false);
>>>>> +				bio_clear_flag(bio, BIO_REFFED);
>>>>> +				bio_put(bio);
>>>>> +				blk_finish_plug(&plug);
>>>>> +				return -EAGAIN;
>>>>
>>>> Doesn't this mean that for a very large IO request that has a 100%
>>>> chance of being split, the user will always get -EAGAIN? Not that I mind,
>>>> doing super large IOs with NOWAIT is not a smart thing to do in the first
>>>> place... But as a user interface, it seems that this will prevent any
>>>> forward progress for such really large NOWAIT IOs. Is that OK?
>>>
>>> Right, if you asked for NOWAIT, then it would not necessarily succeed if
>>> it:
>>>
>>> 1) Needs multiple bios
>>> 2) Needs splitting
>>>
>>> You're expected to attempt blocking issue at that point. Reasoning is
>>> explained in this commit (and the previous one related to the issue):
>>> otherwise you can end up with some arbitrary part of the request
>>> being written to disk or read from disk, but EAGAIN being returned for
>>> the request as a whole.
>>
>> Yes, I understood all that and completely agree with it.
>>
>> I was only wondering whether this change might surprise some (bad)
>> userspace code. Do we need to update some man page or other doc, mentioning
>> that there is no guarantee that a NOWAIT IO will actually be executed if it
>> is too large (e.g. larger than max_sectors_kb)?
> 
> We can certainly add it to the man pages talking about RWF_NOWAIT. But
> there's never been a guarantee that a request that got EAGAIN will later
> succeed under the same conditions, and honestly there are various
> conditions where this is already not true. And those same cases could
> already spuriously yield EAGAIN before this change, so it's not a new
> condition for those sizes of IOs.

OK. Thanks.
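
For anyone reading along, here is a minimal userspace sketch of the
fallback described above: try the IO with RWF_NOWAIT first and, if the
kernel answers EAGAIN, reissue the same IO from blocking context. The
helper name read_nowait_then_block() is made up for illustration and is
not part of the patch. Treating EOPNOTSUPP the same way as EAGAIN is
also an assumption, added so the helper still works on files that do
not support nonblocking issue at all.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for preadv2() and RWF_NOWAIT */
#endif
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the patch: attempt a nonblocking
 * read and fall back to a regular blocking read when asked to retry.
 * Returns the number of bytes read, or -1 with errno set.
 */
static ssize_t read_nowait_then_block(int fd, void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t ret;

	/* First attempt: do not block, fail with EAGAIN instead. */
	ret = preadv2(fd, &iov, 1, off, RWF_NOWAIT);
	if (ret >= 0)
		return ret;

	/*
	 * EAGAIN means "retry from blocking context". EOPNOTSUPP is
	 * handled the same way here as an assumption (no NOWAIT support
	 * for this file). Any other error is reported to the caller.
	 */
	if (errno != EAGAIN && errno != EOPNOTSUPP)
		return -1;

	/* Second attempt: same IO, flags == 0, allowed to block. */
	return preadv2(fd, &iov, 1, off, 0);
}
```

The point of the sketch is that EAGAIN for a NOWAIT request is handled
as "reissue blocking", not as "retry NOWAIT in a loop": a request that
needs multiple bios may get EAGAIN on every nonblocking attempt.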

-- 
Damien Le Moal
Western Digital Research