Re: [PATCH RFC 5.13 2/2] io_uring: submit sqes in the original context when waking up sqthread

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/28/21 3:34 PM, Pavel Begunkov wrote:
> On 4/28/21 2:32 PM, Hao Xu wrote:
>> sqes are submitted by sqthread when it is leveraged, which means there
>> is IO latency when waking up sqthread. To wipe it out, submit limited
>> number of sqes in the original task context.
>> Tests result below:
> 
> Frankly, it can be a nest of corner cases if not now then in the future,
> leading to a high maintenance burden. Hence, if we consider the change,
> I'd rather want to limit the userspace exposure, so it can be removed
> if needed.
> 
> A noticeable change of behaviour here, as Hao recently asked, is that
> the ring can be passed to a task from a completely another thread group,
> and so the feature would execute from that context, not from the
> original/sqpoll one.

So maybe something like:
if (same_thread_group()) {
	/* submit */
}

> 
> Not sure IORING_ENTER_SQ_DEPUTY knob is needed, but at least can be
> ignored if the previous point is addressed.

I'd question whether it'd be better with the flag or without doing
this feature by default.

> 
>>
>> 99th latency:
>> iops\idle	10us	60us	110us	160us	210us	260us	310us	360us	410us	460us	510us
>> with this patch:
>> 2k      	13	13	12	13	13	12	12	11	11	10.304	11.84
>> without this patch:
>> 2k      	15	14	15	15	15	14	15	14	14	13	11.84
> 
> Not sure the second nine describes it well enough, please can you
> add more data? Mean latency, 50%, 90%, 99%, 99.9%, t-put.
> 
> Btw, how happened that only some of the numbers have fractional part?
> Can't believe they all but 3 were close enough to integer values.
> 
>> fio config:
>> ./run_fio.sh
>> fio \
>> --ioengine=io_uring --sqthread_poll=1 --hipri=1 --thread=1 --bs=4k \
>> --direct=1 --rw=randread --time_based=1 --runtime=300 \
>> --group_reporting=1 --filename=/dev/nvme1n1 --sqthread_poll_cpu=30 \
>> --randrepeat=0 --cpus_allowed=35 --iodepth=128 --rate_iops=${1} \
>> --io_sq_thread_idle=${2}
>>
>> Signed-off-by: Hao Xu <haoxu@xxxxxxxxxxxxxxxxx>
>> ---
>>  fs/io_uring.c                 | 29 +++++++++++++++++++++++------
>>  include/uapi/linux/io_uring.h |  1 +
>>  2 files changed, 24 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 1871fad48412..f0a01232671e 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -1252,7 +1252,12 @@ static void io_queue_async_work(struct io_kiocb *req)
>>  {
>>  	struct io_ring_ctx *ctx = req->ctx;
>>  	struct io_kiocb *link = io_prep_linked_timeout(req);
>> -	struct io_uring_task *tctx = req->task->io_uring;
>> +	struct io_uring_task *tctx = NULL;
>> +
>> +	if (ctx->sq_data && ctx->sq_data->thread)
>> +		tctx = ctx->sq_data->thread->io_uring;
> 
> without park it's racy, sq_data->thread may become NULL and removed,
> as well as its ->io_uring.
> 
>> +	else
>> +		tctx = req->task->io_uring;
>>  
>>  	BUG_ON(!tctx);
>>  	BUG_ON(!tctx->io_wq);
> 
> [snip]
> 

-- 
Pavel Begunkov



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux