Re: [PATCH 12/12] io_uring: support true async buffered reads, if file provides it

Jens Axboe <axboe@xxxxxxxxx> · Tue, 26 May 2020 07:50:49 -0600

On 5/26/20 1:44 AM, Pavel Begunkov wrote:
> On 25/05/2020 22:59, Jens Axboe wrote:
>> On 5/25/20 1:29 AM, Pavel Begunkov wrote:
>>> On 23/05/2020 21:57, Jens Axboe wrote:
>>>> If the file is flagged with FMODE_BUF_RASYNC, then we don't have to punt
>>>> the buffered read to an io-wq worker. Instead we can rely on page
>>>> unlocking callbacks to support retry based async IO. This is a lot more
>>>> efficient than doing async thread offload.
>>>>
>>>> The retry is done similarly to how we handle poll based retry. From
>>>> the unlock callback, we simply queue the retry to a task_work based
>>>> handler.
>>>>
>>>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>>> ---
>>>>  fs/io_uring.c | 99 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 99 insertions(+)
>>>>
>>> ...
>>>> +
>>>> +	init_task_work(&rw->task_work, io_async_buf_retry);
>>>> +	/* submit ref gets dropped, acquire a new one */
>>>> +	refcount_inc(&req->refs);
>>>> +	tsk = req->task;
>>>> +	ret = task_work_add(tsk, &rw->task_work, true);
>>>> +	if (unlikely(ret)) {
>>>> +		/* queue just for cancelation */
>>>> +		init_task_work(&rw->task_work, io_async_buf_cancel);
>>>> +		tsk = io_wq_get_task(req->ctx->io_wq);
>>>
>>> IIRC, task will be put somewhere around io_free_req(). Then shouldn't here be
>>> some juggling with reassigning req->task with task_{get,put}()?
>>
>> Not sure I follow? Yes, we'll put this task again when the request
>> is freed, but not sure what you mean with juggling?
> 
> I meant something like:
> 
> ...
> /* queue just for cancelation */
> init_task_work(&rw->task_work, io_async_buf_cancel);
> + put_task_struct(req->task);
> + req->task = get_task_struct(io_wq_task);
> 
> 
> but, thinking twice, if I got the whole idea right, it should be ok as
> is -- io-wq won't go away before the request anyway, and leaving
> req->task pinned down for a bit is not a problem.

OK good, then I thin kwe agree it's fine.

>>>> +		task_work_add(tsk, &rw->task_work, true);
>>>> +	}
>>>> +	wake_up_process(tsk);
>>>> +	return 1;
>>>> +}
>>> ...
>>>>  static int io_read(struct io_kiocb *req, bool force_nonblock)
>>>>  {
>>>>  	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
>>>> @@ -2601,6 +2696,7 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
>>>>  	if (!ret) {
>>>>  		ssize_t ret2;
>>>>  
>>>> +retry:
>>>>  		if (req->file->f_op->read_iter)
>>>>  			ret2 = call_read_iter(req->file, kiocb, &iter);
>>>>  		else
>>>> @@ -2619,6 +2715,9 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
>>>>  			if (!(req->flags & REQ_F_NOWAIT) &&
>>>>  			    !file_can_poll(req->file))
>>>>  				req->flags |= REQ_F_MUST_PUNT;
>>>> +			if (io_rw_should_retry(req))
>>>
>>> It looks like a state machine with IOCB_WAITQ and gotos. Wouldn't it be cleaner
>>> to call call_read_iter()/loop_rw_iter() here directly instead of "goto retry" ?
>>
>> We could, probably making that part a separate helper then. How about the
>> below incremental?
> 
> IMHO, it was easy to get lost with such implicit state switching.
> Looks better now! See a small comment below.

Agree, that is cleaner.

>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index a5a4d9602915..669dccd81207 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -2677,6 +2677,13 @@ static bool io_rw_should_retry(struct io_kiocb *req)
>>  	return false;
>>  }
>>  
>> +static int __io_read(struct io_kiocb *req, struct iov_iter *iter)
>> +{
>> +	if (req->file->f_op->read_iter)
>> +		return call_read_iter(req->file, &req->rw.kiocb, iter);
>> +	return loop_rw_iter(READ, req->file, &req->rw.kiocb, iter);
>> +}
>> +
>>  static int io_read(struct io_kiocb *req, bool force_nonblock)
>>  {
>>  	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
>> @@ -2710,11 +2717,7 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
>>  	if (!ret) {
>>  		ssize_t ret2;
>>  
>> -retry:
>> -		if (req->file->f_op->read_iter)
>> -			ret2 = call_read_iter(req->file, kiocb, &iter);
>> -		else
>> -			ret2 = loop_rw_iter(READ, req->file, kiocb, &iter);
>> +		ret2 = __io_read(req, &iter);
>>  
>>  		/* Catch -EAGAIN return for forced non-blocking submission */
>>  		if (!force_nonblock || ret2 != -EAGAIN) {
>> @@ -2729,8 +2732,11 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
>>  			if (!(req->flags & REQ_F_NOWAIT) &&
>>  			    !file_can_poll(req->file))
>>  				req->flags |= REQ_F_MUST_PUNT;
>> -			if (io_rw_should_retry(req))
>> -				goto retry;
>> +			if (io_rw_should_retry(req)) {
>> +				ret2 = __io_read(req, &iter);
>> +				if (ret2 != -EAGAIN)
>> +					goto out_free;
> 
> "goto out_free" returns ret=0, so someone should add a cqe
> 
> if (ret2 != -EAGAIN) {
> 	kiocb_done(kiocb, ret2);
> 	goto free_out;
> }

Fixed up in the current one.

-- 
Jens Axboe