Re: IORING_OP_POLL_ADD slower than linux-aio IOCB_CMD_POLL

On 4/19/22 1:41 PM, Avi Kivity wrote:
> 
> On 19/04/2022 20.14, Jens Axboe wrote:
>> On 4/19/22 9:21 AM, Jens Axboe wrote:
>>> On 4/19/22 6:31 AM, Jens Axboe wrote:
>>>> On 4/19/22 6:21 AM, Avi Kivity wrote:
>>>>> On 19/04/2022 15.04, Jens Axboe wrote:
>>>>>> On 4/19/22 5:57 AM, Avi Kivity wrote:
>>>>>>> On 19/04/2022 14.38, Jens Axboe wrote:
>>>>>>>> On 4/19/22 5:07 AM, Avi Kivity wrote:
>>>>>>>>> A simple webserver shows about 5% loss compared to linux-aio.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I expect the loss is due to an optimization that io_uring lacks -
>>>>>>>>> inline completion vs workqueue completion:
>>>>>>>> I don't think that's it, io_uring never punts to a workqueue for
>>>>>>>> completions.
>>>>>>> I measured this:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>    Performance counter stats for 'system wide':
>>>>>>>
>>>>>>>            1,273,756 io_uring:io_uring_task_add
>>>>>>>
>>>>>>>         12.288597765 seconds time elapsed
>>>>>>>
>>>>>>> Which exactly matches with the number of requests sent. If that's the
>>>>>>> wrong counter to measure, I'm happy to try again with the correct
>>>>>>> counter.
>>>>>> io_uring_task_add() isn't a workqueue, it's task_work. So that is
>>>>>> expected.
>>> Might actually be implicated. Not because it's an async worker, but
>>> because I think we might be losing some affinity in this case. Looking
>>> at traces, we're definitely bouncing between the poll completion side
>>> and then executing the completion.
>>>
>>> Can you try this hack? It's against -git + for-5.19/io_uring. If you let
>>> me know what base you prefer, I can do a version against that. I see
>>> about a 3% win with io_uring with this, and was slower before against
>>> linux-aio as you saw as well.
>> Another thing to try - get rid of the IPI for TWA_SIGNAL, which I
>> believe may be the underlying cause of it.
>>
> 
> Won't it delay notification until the next io_uring_enter? Or does
> io_uring only guarantee completions when you call it (and earlier
> completions are best-effort?)

Only if it needs to reschedule, it'll still enter the kernel if not. Or
if it's waiting in the kernel, it'll still run the task work as the
TIF_NOTIFY_SIGNAL will get that job done.

So I'm actually not sure we ever need the IPI; it doesn't seem like we do.

> I'll try it tomorrow (also the other patch).

Thanks!

-- 
Jens Axboe



