Re: [PATCH RFC 5.13 2/2] io_uring: submit sqes in the original context when waking up sqthread

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/28/21 2:32 PM, Hao Xu wrote:
> sqes are submitted by sqthread when it is leveraged, which means there
> is IO latency when waking up sqthread. To wipe it out, submit limited
> number of sqes in the original task context.
> Tests result below:

Frankly, it can be a nest of corner cases if not now then in the future,
leading to a high maintenance burden. Hence, if we consider the change,
I'd rather want to limit the userspace exposure, so it can be removed
if needed.

A noticeable change of behaviour here, as Hao recently asked, is that
the ring can be passed to a task from a completely another thread group,
and so the feature would execute from that context, not from the
original/sqpoll one.

Not sure IORING_ENTER_SQ_DEPUTY knob is needed, but at least can be
ignored if the previous point is addressed.

> 
> 99th latency:
> iops\idle	10us	60us	110us	160us	210us	260us	310us	360us	410us	460us	510us
> with this patch:
> 2k      	13	13	12	13	13	12	12	11	11	10.304	11.84
> without this patch:
> 2k      	15	14	15	15	15	14	15	14	14	13	11.84

Not sure the second nine describes it well enough, please can you
add more data? Mean latency, 50%, 90%, 99%, 99.9%, t-put.

Btw, how happened that only some of the numbers have fractional part?
Can't believe they all but 3 were close enough to integer values.

> fio config:
> ./run_fio.sh
> fio \
> --ioengine=io_uring --sqthread_poll=1 --hipri=1 --thread=1 --bs=4k \
> --direct=1 --rw=randread --time_based=1 --runtime=300 \
> --group_reporting=1 --filename=/dev/nvme1n1 --sqthread_poll_cpu=30 \
> --randrepeat=0 --cpus_allowed=35 --iodepth=128 --rate_iops=${1} \
> --io_sq_thread_idle=${2}
> 
> Signed-off-by: Hao Xu <haoxu@xxxxxxxxxxxxxxxxx>
> ---
>  fs/io_uring.c                 | 29 +++++++++++++++++++++++------
>  include/uapi/linux/io_uring.h |  1 +
>  2 files changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 1871fad48412..f0a01232671e 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1252,7 +1252,12 @@ static void io_queue_async_work(struct io_kiocb *req)
>  {
>  	struct io_ring_ctx *ctx = req->ctx;
>  	struct io_kiocb *link = io_prep_linked_timeout(req);
> -	struct io_uring_task *tctx = req->task->io_uring;
> +	struct io_uring_task *tctx = NULL;
> +
> +	if (ctx->sq_data && ctx->sq_data->thread)
> +		tctx = ctx->sq_data->thread->io_uring;

without park it's racy, sq_data->thread may become NULL and removed,
as well as its ->io_uring.

> +	else
> +		tctx = req->task->io_uring;
>  
>  	BUG_ON(!tctx);
>  	BUG_ON(!tctx->io_wq);

[snip]

-- 
Pavel Begunkov



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux