Re: [RFC v2 2/3] io_uring: add fixed poll support

hi,

> On 10/28/21 6:28 AM, Xiaoguang Wang wrote:
>> Recently I spent some time studying the performance of io_uring's
>> fast-poll and multi-shot modes using a network echo-server model.
>> Previously I always thought fast-poll was better than multi-shot and
>> would give better performance, but in real tests multi-shot is almost
>> always better than fast-poll, which is very interesting. I used eBPF to
>> take some measurements, and they show that whether fast-poll performs
>> well depends entirely on whether the first nowait try in io_issue_sqe()
>> succeeds or fails. Take the io_recv operation as an example (the recv
>> buffer is 16 bytes):
>>   1) The first nowait try succeeds, so a simple io_recv() is enough.
>> On my test machine, a successful io_recv() takes 1110ns on average.
>>
>>   2) The first nowait try fails, and then we have some expensive work
>> to do: the failed io_recv(), apoll allocations, vfs_poll(), miscellaneous
>> initializations and checks in __io_arm_poll_handler(), and a final
>> successful io_recv(). Among them:
>>     the failed io_recv() takes 620ns on average.
>>     vfs_poll() takes 550ns on average.
>> I haven't measured the other overhead yet, but we can see that if the
>> first nowait try fails, we need at least 2280ns (620 + 550 + 1110) to
>> complete it. In my echo server tests, 40% of first nowait io_recv()
>> operations fail.
>>
>> These measurements explain why multi-shot is better than fast-poll:
>> multi-shot can ensure that the first nowait try succeeds.
>>
>> Based on the above measurements, I try to improve fast-poll a bit:
>> introduce fixed poll support, which currently only works in registered
>> file mode. With this feature, we can get rid of various repeated
>> operations in io_arm_poll_handler(), including apoll allocations and
>> miscellaneous initializations and checks.
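
As a rough check on the figures quoted above, the per-request cost can be
written as a small model. This is only a sketch: it uses the three measured
averages (1110ns, 620ns, 550ns) and the 40% failure ratio from the echo
server test, and it ignores the overhead that was not measured (apoll
allocation, initialization and checks), so the result is a lower bound. The
function and constant names are made up for illustration.

```python
# Per-operation costs in nanoseconds, taken from the measurements quoted above.
RECV_OK_NS = 1110    # successful io_recv()
RECV_FAIL_NS = 620   # failed first nowait io_recv()
VFS_POLL_NS = 550    # vfs_poll()


def slow_path_ns():
    """Lower bound for a request whose first nowait try fails.

    Unmeasured costs (apoll allocation, initialization, checks) are
    deliberately left out, so the real figure is higher.
    """
    return RECV_FAIL_NS + VFS_POLL_NS + RECV_OK_NS


def expected_recv_ns(fail_ratio):
    """Average cost per recv for a given ratio of failed first tries."""
    return (1 - fail_ratio) * RECV_OK_NS + fail_ratio * slow_path_ns()


if __name__ == "__main__":
    print("slow path lower bound (ns):", slow_path_ns())
    print("average at 40% failures (ns):", round(expected_recv_ns(0.40)))
```

The three measured terms sum to 2280ns for the slow path, and with 40% of
first tries failing the average comes out at roughly 1578ns per recv,
against 1110ns when the first try always succeeds.
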
> I was toying with an idea on how to do persistent poll support,
> basically moving the wait_queue_entry out of io_poll and hence detaching
> it from the io_kiocb. That would allow a per-file (and type) poll entry
> to remain persistent in the kernel rather than needing to do this
> expensive work repeatedly. Pavel kindly reminded me of your work, which
> unfortunately I had totally forgotten.
>
> Did you end up taking this further? My idea was to make it work
> independently of fixed files, but I also don't want to reinvent the
> wheel if you ended up with something like this.
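
To make the persistent poll idea above a bit more concrete, here is a toy
userspace model. It is not kernel code and does not reflect any actual
patch: PollCache, arm() and per_request_setups() are hypothetical names, and
the "setup" counter merely stands in for the allocation and wait queue
initialization that would otherwise be repeated per request. It only shows
the bookkeeping of keeping one entry per (file, poll type) pair.

```python
class PollCache:
    """Toy model: one persistent poll entry per (file, event type) pair."""

    def __init__(self):
        self.entries = {}   # (file_id, events) -> entry
        self.setups = 0     # how many times the expensive setup ran

    def arm(self, file_id, events):
        key = (file_id, events)
        entry = self.entries.get(key)
        if entry is None:
            # Stand-in for the allocation and wait queue setup; with a
            # persistent entry this runs once per key, not once per request.
            self.setups += 1
            entry = {"file": file_id, "events": events, "armed": 0}
            self.entries[key] = entry
        entry["armed"] += 1
        return entry


def per_request_setups(requests):
    """Baseline: every request that needs polling pays for its own setup."""
    return len(requests)


if __name__ == "__main__":
    requests = [(3, "POLLIN"), (3, "POLLIN"), (4, "POLLIN"), (3, "POLLOUT")]
    cache = PollCache()
    for file_id, events in requests:
        cache.arm(file_id, events)
    print("per-request setups:", per_request_setups(requests))
    print("persistent setups:", cache.setups)
```

In this model the saving grows with the number of requests issued against
the same file and poll type, which is the echo-server pattern measured
earlier in the thread.
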
I haven't continued working on this since the last patch set, and
currently I don't have time to pick it up again, sorry. It would be great
if we could add similar fixed poll support for the fast-poll feature, or
eliminate the overhead of a possibly failed first nowait submit.
Recently, one of our clients also wants to use asio (with io_uring
enabled), but it seems that asio (using io_uring fast-poll) does not
perform better than asio (epoll); I need to figure that out first.

asio: https://github.com/chriskohlhoff/asio.git


Regards,
Xiaoguang Wang