Re: [GIT PULL] io_uring updates for 5.18-rc1

On Sat, 2022-03-26 at 13:06 -0700, Jakub Kicinski wrote:
> On Sat, 26 Mar 2022 13:47:24 -0600 Jens Axboe wrote:
> 
> > Which constants are you referring to? Only odd one I see is
> > NAPI_TIMEOUT, other ones are using the sysctl bits. If we're
> > missing something here, do speak up and we'll make sure it's
> > consistent with the regular NAPI.
> 
> SO_BUSY_POLL_BUDGET, 8 is quite low for many practical uses.
> I'd also like to have a conversation about continuing to use
> the socket as a proxy for NAPI_ID, NAPI_ID is exposed to user
> space now. io_uring being a new interface I wonder if it's not 
> better to let the user specify the request parameters directly.

My NAPI busy poll integration is strongly inspired by the epoll code,
as its persistent context is much closer to the io_uring situation
than what the select/poll code does.

For instance, the BUSY_POLL_BUDGET constant is taken straight from the
epoll code.
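
For reference, this is approximately what fs/eventpoll.c does today
(quoting from memory, so check the tree for the exact form):

    #define BUSY_POLL_BUDGET 8

    static bool ep_busy_loop(struct eventpoll *ep, int nonblock)
    {
            unsigned int napi_id = READ_ONCE(ep->napi_id);

            if ((napi_id >= MIN_NAPI_ID) && net_busy_loop_on()) {
                    napi_busy_loop(napi_id,
                                   nonblock ? NULL : ep_busy_loop_end,
                                   ep, false, BUSY_POLL_BUDGET);
                    if (ep_events_available(ep))
                            return true;
                    /* busy poll timed out, forget the NAPI ID */
                    ep->napi_id = 0;
            }
            return false;
    }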

I am a little surprised by your questioning. If BUSY_POLL_BUDGET is
quite low for many practical uses, how is it acceptable for the epoll
code?

If 8 is not a good default value, may I suggest changing the value of
the define instead?

TBH, I could not find any documentation for the busy poll budget
parameter, so I assumed that the existing code was doing the right
thing...

As for your other suggestion, I do not think it is a good idea to let
the user specify per request which NAPI IDs to busy poll: it would
make the io_uring interface more complex without even knowing whether
this is something that people want or need.

The select/poll implementation examines each and every socket on every
call, and it can afford to do so since it rebuilds the polling set
every time through sock_poll().
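
Concretely, sock_poll() in net/socket.c handles busy polling along
these lines (paraphrased from memory):

    static __poll_t sock_poll(struct file *file, poll_table *wait)
    {
            struct socket *sock = file->private_data;
            __poll_t events = poll_requested_events(wait), flag = 0;

            if (sk_can_busy_loop(sock->sk)) {
                    /* poll once if requested by the syscall */
                    if (events & POLL_BUSY_LOOP)
                            sk_busy_loop(sock->sk, 1);

                    /* tell the syscall this socket can busy poll */
                    flag = POLL_BUSY_LOOP;
            }

            return sock->ops->poll(file, sock, wait) | flag;
    }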

The epoll code does not want to do that, as it would defeat its
purpose, so it relies on the global busy poll setting. The epoll code
also makes a pretty bold assumption: that users who want busy polling
will be willing to create one epoll set per receive queue, and
presumably run each set in a dedicated thread.

In:
https://legacy.netdevconf.info/2.1/slides/apr6/dumazet-BUSY-POLLING-Netdev-2.1.pdf

Linux-4.12 changes:
epoll() support was added by Sridhar Samudrala and Alexander Duyck,
with the assumption that an application using epoll() and busy polling
would first make sure that it would classify sockets based on their
receive queue (NAPI ID), and use at least one epoll fd per receive
queue.
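
In user-space terms, the expected pattern looks something like the
sketch below (untested; epfd_for_napi_id() is a hypothetical helper
that maps a NAPI ID to the epoll fd of the thread owning that queue):

    #include <stdio.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>

    extern int epfd_for_napi_id(unsigned int napi_id);

    /* Classify an accepted socket into one epoll set per receive
     * queue, keyed by the NAPI ID the kernel reports for it.
     */
    int classify_socket(int fd)
    {
            unsigned int napi_id = 0;
            socklen_t len = sizeof(napi_id);
            struct epoll_event ev = {
                    .events = EPOLLIN,
                    .data.fd = fd,
            };

            if (getsockopt(fd, SOL_SOCKET, SO_INCOMING_NAPI_ID,
                           &napi_id, &len) < 0) {
                    perror("getsockopt(SO_INCOMING_NAPI_ID)");
                    return -1;
            }

            return epoll_ctl(epfd_for_napi_id(napi_id),
                             EPOLL_CTL_ADD, fd, &ev);
    }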

To me, this places a very big burden on the shoulders of the users, as
not every application design can accommodate this requirement. For
instance, I have an Intel igb NIC with 8 receive queues, and I am not
running my app on a 24-core machine where I could easily allocate 8
threads just for the networking I/O. I sincerely feel that the epoll
busy poll integration has been tailored specifically to the patch
authors' needs, without all the usability concerns that you appear to
have for the io_uring implementation.

I went beyond what epoll offers by allowing several receive queues to
be busy polled from a single ring.
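
Roughly, the ring keeps a small list of every NAPI ID it has seen
traffic on and busy polls each of them; the sketch below is a
simplification of the patch, with names and details elided:

    /* Sketch only; simplified from the actual patch. */
    struct napi_entry {
            struct list_head        list;
            unsigned int            napi_id;
            unsigned long           timeout;
    };

    static void io_napi_busy_loop(struct list_head *napi_list)
    {
            struct napi_entry *ne;

            /* Poll each receive queue the ring has seen data on. */
            list_for_each_entry(ne, napi_list, list)
                    napi_busy_loop(ne->napi_id, NULL, NULL, true,
                                   BUSY_POLL_BUDGET);
    }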

When I mentioned my interest in NAPI busy polling on the io_uring
list, I did not know how to proceed because of several unknowns, but
Jens encouraged me to give it a shot. In that context, my design goal
has been to keep the initial implementation reasonable and simple.

One concession that could be made to address your concern is that the
socket receive queues added to the list of busy-polled queues could be
narrowed further by using sk_can_busy_loop() instead of just checking
net_busy_loop_on().
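
The difference is that net_busy_loop_on() only consults the global
sysctl, while sk_can_busy_loop() also honors the per-socket value set
via SO_BUSY_POLL; roughly, from include/net/busy_poll.h:

    static inline bool net_busy_loop_on(void)
    {
            return sysctl_net_busy_poll;
    }

    static inline bool sk_can_busy_loop(const struct sock *sk)
    {
            return sk->sk_ll_usec && !signal_pending(current);
    }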

Would that be a satisfactory compromise to you and your team?



