On 25/06/2020 21:27, Jens Axboe wrote:
> Currently links are always done in an async fashion, unless we
> catch them inline after we successfully complete a request without
> having to resort to blocking. This isn't necessarily the most efficient
> approach, it'd be more ideal if we could just use the task_work handling
> for this.

Well, you beat me to this. As mentioned, I was going to rebase it after
landing the iopoll fixes. Nice numbers! A small comment below, but LGTM.
I'll review more formally with a fresh head. Could you push it to a
branch? My other patches would conflict.

> 
> Outside of saving an async jump, we can also do less prep work for
> these kinds of requests.
> 
> Running dependent links from the task_work handler yields some nice
> performance benefits. As an example, examples/link-cp from the liburing
> repository uses read+write links to implement a copy operation. Without
> this patch, a cache cold read of a 4G file from a VM runs in about
> 3 seconds:
> 
> $ time examples/link-cp /data/file /dev/null
> 
> real    0m2.986s
> user    0m0.051s
> sys     0m2.843s
> 
> and a subsequent cache hot run looks like this:
> 
> $ time examples/link-cp /data/file /dev/null
> 
> real    0m0.898s
> user    0m0.069s
> sys     0m0.797s
> 
> With this patch in place, the cold case takes about 2.4 seconds:
> 
> $ time examples/link-cp /data/file /dev/null
> 
> real    0m2.400s
> user    0m0.020s
> sys     0m2.366s
> 
> and the cache hot case looks like this:
> 
> $ time examples/link-cp /data/file /dev/null
> 
> real    0m0.676s
> user    0m0.010s
> sys     0m0.665s
> 
> As expected, the (mostly) cache hot case yields the biggest improvement,
> running about 25% faster with this change, while the cache cold case
> yields about a 20% increase in performance. Outside of the performance
> increase, we're using less CPU as well, as we're not using the async
> offload threads at all for this anymore.
> 
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> 
> ---
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 0bba12e4e559..389274a078c8 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
...
> 
> +static void io_sq_thread_drop_mm(struct io_ring_ctx *ctx)
> +{
> +        struct mm_struct *mm = current->mm;
> +
> +        if (mm) {
> +                kthread_unuse_mm(mm);
> +                mmput(mm);
> +        }
> +}
> +
> +static int io_sq_thread_acquire_mm(struct io_ring_ctx *ctx,
> +                                   struct io_kiocb *req)
> +{
> +        if (io_op_defs[req->opcode].needs_mm && !current->mm) {
> +                if (unlikely(!mmget_not_zero(ctx->sqo_mm)))
> +                        return -EFAULT;
> +                kthread_use_mm(ctx->sqo_mm);
> +        }
> +
> +        return 0;
> +}
...
> +static void __io_req_task_submit(struct io_kiocb *req)
> +{
> +        struct io_ring_ctx *ctx = req->ctx;
> +
> +        __set_current_state(TASK_RUNNING);
> +        if (!io_sq_thread_acquire_mm(ctx, req)) {

My last patch replaced this call with the "__" version. Is this a merge
artifact, or is it intended?

> +                mutex_lock(&ctx->uring_lock);
> +                __io_queue_sqe(req, NULL, NULL);
> +                mutex_unlock(&ctx->uring_lock);
> +        } else {
> +                __io_req_task_cancel(req, -EFAULT);
> +        }
> +}
> +

-- 
Pavel Begunkov
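
For context on the read+write link pattern being benchmarked above, a
minimal sketch of one linked read+write pair with liburing could look
like the following. This is an illustration, not code from the patch or
from examples/link-cp itself: the helper name copy_chunk and its
parameters are made up, error paths are trimmed, and the real link-cp
loops this over the file in chunks.

#include <liburing.h>

/*
 * Hypothetical helper: submit a read linked to a write. IOSQE_IO_LINK
 * makes the write start only after the read completes; a failed or
 * short read severs the link and the write completes with -ECANCELED.
 */
static int copy_chunk(struct io_uring *ring, int infd, int outfd,
                      void *buf, unsigned len, off_t off)
{
        struct io_uring_sqe *sqe;
        struct io_uring_cqe *cqe;
        int i, ret;

        /* read from the source file... */
        sqe = io_uring_get_sqe(ring);
        io_uring_prep_read(sqe, infd, buf, len, off);
        /* ...and link the following write so it runs after the read */
        sqe->flags |= IOSQE_IO_LINK;

        sqe = io_uring_get_sqe(ring);
        io_uring_prep_write(sqe, outfd, buf, len, off);

        ret = io_uring_submit(ring);
        if (ret < 0)
                return ret;

        /* reap both completions, remembering any failure */
        ret = 0;
        for (i = 0; i < 2; i++) {
                if (io_uring_wait_cqe(ring, &cqe))
                        return -1;
                if (cqe->res < 0)
                        ret = cqe->res;
                io_uring_cqe_seen(ring, cqe);
        }
        return ret;
}

The point of the patch is that the kernel side of the write in this
chain no longer has to bounce through the async offload threads; it can
be issued from task_work once the read completes.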