Re: [PATCH RFC 5.13 2/2] io_uring: submit sqes in the original context when waking up sqthread

On 4/28/21 7:32 AM, Hao Xu wrote:
> sqes are submitted by sqthread when it is leveraged, which means there
> is IO latency when waking up sqthread. To wipe it out, submit limited
> number of sqes in the original task context.
> Tests result below:
> 
> 99th latency:
> iops\idle	10us	60us	110us	160us	210us	260us	310us	360us	410us	460us	510us
> with this patch:
> 2k      	13	13	12	13	13	12	12	11	11	10.304	11.84
> without this patch:
> 2k      	15	14	15	15	15	14	15	14	14	13	11.84
> 
> fio config:
> ./run_fio.sh
> fio \
> --ioengine=io_uring --sqthread_poll=1 --hipri=1 --thread=1 --bs=4k \
> --direct=1 --rw=randread --time_based=1 --runtime=300 \
> --group_reporting=1 --filename=/dev/nvme1n1 --sqthread_poll_cpu=30 \
> --randrepeat=0 --cpus_allowed=35 --iodepth=128 --rate_iops=${1} \
> --io_sq_thread_idle=${2}

Interesting concept! One question:

> @@ -9304,8 +9311,18 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz
>  		if (unlikely(ctx->sq_data->thread == NULL)) {
>  			goto out;
>  		}
> -		if (flags & IORING_ENTER_SQ_WAKEUP)
> +		if (flags & IORING_ENTER_SQ_WAKEUP) {
>  			wake_up(&ctx->sq_data->wait);
> +			if ((flags & IORING_ENTER_SQ_DEPUTY) &&
> +					!(ctx->flags & IORING_SETUP_IOPOLL)) {
> +				ret = io_uring_add_task_file(ctx);
> +				if (unlikely(ret))
> +					goto out;
> +				mutex_lock(&ctx->uring_lock);
> +				io_submit_sqes(ctx, min(to_submit, 8U));
> +				mutex_unlock(&ctx->uring_lock);
> +			}
> +		}

Do we want to wake the sqpoll thread _post_ submitting these IOs? The
idea being that if we're submitting now after a while (since the thread
is sleeping), then we're most likely going to be submitting more than
just this single batch. The wakeup would do the same job if done after
the submit; it just wouldn't interfere with this submit. You could imagine
a scenario where we do the wake and the sqpoll thread beats us to the
submit, and now we're just stuck waiting for the uring_lock and end up
doing nothing.
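
To make the ordering concrete, here is a rough user-space model of
"submit first, wake after". It is only a sketch, not kernel code:
struct ring_model, ring_model_init(), poller_thread() and
submit_then_wake() are hypothetical names made up for illustration, a
pthread mutex and condition variable stand in for ctx->uring_lock and
ctx->sq_data->wait, and counters stand in for real sqes. The only value
taken from the patch is the cap of 8.

```c
#include <pthread.h>

/* Cap on inline submissions, from min(to_submit, 8U) in the quoted patch. */
#define INLINE_SUBMIT_MAX 8U

/* Hypothetical user-space stand-in for the ring state; not a kernel struct. */
struct ring_model {
	pthread_mutex_t uring_lock;	/* models ctx->uring_lock */
	pthread_mutex_t wait_lock;	/* protects 'woken' */
	pthread_cond_t wait;		/* models ctx->sq_data->wait */
	int woken;			/* set once the poller has been woken */
	unsigned pending;		/* entries not yet submitted */
	unsigned by_caller;		/* entries submitted in the caller's context */
	unsigned by_poller;		/* entries submitted by the poller thread */
};

static void ring_model_init(struct ring_model *r, unsigned pending)
{
	pthread_mutex_init(&r->uring_lock, NULL);
	pthread_mutex_init(&r->wait_lock, NULL);
	pthread_cond_init(&r->wait, NULL);
	r->woken = 0;
	r->pending = pending;
	r->by_caller = 0;
	r->by_poller = 0;
}

/* Models the sleeping sqpoll thread: sleep until woken, then drain the rest. */
static void *poller_thread(void *arg)
{
	struct ring_model *r = arg;

	pthread_mutex_lock(&r->wait_lock);
	while (!r->woken)
		pthread_cond_wait(&r->wait, &r->wait_lock);
	pthread_mutex_unlock(&r->wait_lock);

	pthread_mutex_lock(&r->uring_lock);
	r->by_poller += r->pending;
	r->pending = 0;
	pthread_mutex_unlock(&r->uring_lock);
	return NULL;
}

/*
 * Submit a capped batch in the caller's context, and only then wake the
 * poller. Because the wakeup comes last, the poller cannot grab
 * uring_lock ahead of the inline batch.
 */
static unsigned submit_then_wake(struct ring_model *r, unsigned to_submit)
{
	unsigned n = to_submit < INLINE_SUBMIT_MAX ? to_submit : INLINE_SUBMIT_MAX;

	pthread_mutex_lock(&r->uring_lock);
	if (n > r->pending)
		n = r->pending;
	r->pending -= n;
	r->by_caller += n;
	pthread_mutex_unlock(&r->uring_lock);

	/* The wakeup happens after the inline submit has finished. */
	pthread_mutex_lock(&r->wait_lock);
	r->woken = 1;
	pthread_cond_signal(&r->wait);
	pthread_mutex_unlock(&r->wait_lock);

	return n;
}
```

In this model the caller always gets its capped batch in before the
poller touches the lock, and the poller picks up whatever is left. If
the wake were moved ahead of the inline submit, the two would race for
uring_lock instead, which is the case described above.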

Maybe you guys already tested this? If you did, I'm also curious what
kind of requests are being submitted. That can have quite a bit of
effect on how quickly the submit is done.

-- 
Jens Axboe



