Re: [RFC net-next 0/5] Suspend IRQs during preferred busy poll

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024-08-14 15:53, Samiullah Khawaja wrote:
On Tue, Aug 13, 2024 at 6:19 AM Martin Karsten <mkarsten@xxxxxxxxxxxx> wrote:

On 2024-08-13 00:07, Stanislav Fomichev wrote:
On 08/12, Martin Karsten wrote:
On 2024-08-12 21:54, Stanislav Fomichev wrote:
On 08/12, Martin Karsten wrote:
On 2024-08-12 19:03, Stanislav Fomichev wrote:
On 08/12, Martin Karsten wrote:
On 2024-08-12 16:19, Stanislav Fomichev wrote:
On 08/12, Joe Damato wrote:
Greetings:

[snip]

Note that napi_suspend_irqs/napi_resume_irqs is needed even for the sake of
an individual queue or application to make sure that IRQ suspension is
enabled/disabled right away when the state of the system changes from busy
to idle and back.

Can we not handle everything in napi_busy_loop? If we can mark some napi
contexts as "explicitly polled by userspace with a larger defer timeout",
we should be able to do better compared to current NAPI_F_PREFER_BUSY_POLL
which is more like "this particular napi_poll call is user busy polling".

Then either the application needs to be polling all the time (wasting cpu
cycles) or latencies will be determined by the timeout.
But if I understand correctly, this means that if the application
thread that is supposed
to do napi busy polling gets busy doing work on the new data/events in
userspace, napi polling
will not be done until the suspend_timeout triggers? Do you dispatch
work to a separate worker
threads, in userspace, from the thread that is doing epoll_wait?

Yes, napi polling is suspended while the application is busy between epoll_wait calls. That's where the benefits are coming from.

The consequences depend on the nature of the application and overall preferences for the system. If there's a "dominant" application for a number of queues and cores, the resulting latency for other background applications using the same queues might not be a problem at all.

One other simple mitigation is limiting the number of events that each epoll_wait call accepts. Note that this batch size also determines the worst-case latency for the application in question, so there is a natural incentive to keep it limited.

A more complex application design, like you suggest, might also be an option.

Only when switching back and forth between polling and interrupts is it
possible to get low latencies across a large spectrum of offered loads
without burning cpu cycles at 100%.

Ah, I see what you're saying, yes, you're right. In this case ignore my comment
about ep_suspend_napi_irqs/napi_resume_irqs.

Thanks for probing and double-checking everything! Feedback is important
for us to properly document our proposal.

Let's see how other people feel about per-dev irq_suspend_timeout. Properly
disabling napi during busy polling is super useful, but it would still
be nice to plumb irq_suspend_timeout via epoll context or have it set on
a per-napi basis imho.
I agree, this would allow each napi queue to tune itself based on
heuristics. But I think
doing it through epoll independent interface makes more sense as Stan
suggested earlier.

The question is whether to add a useful mechanism (one sysfs parameter and a few lines of code) that is optional, but with demonstrable and significant performance/efficiency improvements for an important class of applications - or wait for an uncertain future?

Note that adding our mechanism in no way precludes switching the control parameters from per-device to per-napi as Joe alluded to earlier. In fact, it increases the incentive for doing so.

After working on this for quite a while, I am skeptical that anything fundamentally different could be done without re-architecting the entire napi control flow.

Thanks,
Martin





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux