On Fri, Feb 21, 2020 at 06:49:16AM -0800, Jens Axboe wrote:
> > Jens, what exactly is the benefit of running this on every random
> > schedule() vs in io_cqring_wait() ? Or even, since io_cqring_wait() is
> > the very last thing the syscall does, task_work.
>
> I took a step back and I think we can just use the task work, which
> makes this a lot less complicated in terms of locking and schedule
> state. Ran some quick testing with the below and it works for me.
>
> I'm going to re-spin based on this and just dump the sched_work
> addition.

Awesome, simpler is better.

> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 81aa3959f326..413ac86d7882 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -3529,7 +3529,7 @@ static int __io_async_wake(struct io_kiocb *req, struct io_poll_iocb *poll,
>  	 * the exit check will ultimately cancel these work items. Hence we
>  	 * don't need to check here and handle it specifically.
>  	 */
> -	sched_work_add(tsk, &req->sched_work);
> +	task_work_add(tsk, &req->sched_work, true);
>  	wake_up_process(tsk);
>  	return 1;
>  }
> @@ -5367,9 +5367,9 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
>  	do {
>  		if (io_cqring_events(ctx, false) >= min_events)
>  			return 0;
> -		if (!current->sched_work)
> +		if (!current->task_works)
>  			break;
> -		sched_work_run();
> +		task_work_run();
>  	} while (1);
>
>  	if (sig) {
> @@ -5392,6 +5392,12 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
>  						TASK_INTERRUPTIBLE);
>  		if (io_should_wake(&iowq, false))
>  			break;
> +		if (current->task_works) {
> +			task_work_run();
> +			if (io_should_wake(&iowq, false))
> +				break;
> +			continue;
> +		}

Doesn't:

	if (current->task_works)
		task_work_run();
	if (io_should_wake(&iowq, false))
		break;

work?

>  		schedule();
>  		if (signal_pending(current)) {
>  			ret = -EINTR;

Anyway, we need to be careful about the context where we call
task_work_run(), but afaict doing it here should be fine.