On 2024-11-01 17:01, Samudrala, Sridhar wrote:
On 10/31/2024 11:39 PM, Joe Damato wrote:
On Thu, Oct 31, 2024 at 10:47:05PM -0500, Samudrala, Sridhar wrote:
On 10/31/2024 7:48 PM, Joe Damato wrote:
Describe irq suspension, the epoll ioctls, and the tradeoffs of using
different gro_flush_timeout values.
[...]
+To use this mechanism:
+
+ 1. The per-NAPI config parameter ``irq_suspend_timeout`` should be set to the
+ maximum time (in nanoseconds) the application can have its IRQs
+ suspended. This is done using netlink, as described above. This timeout
+ serves as a safety mechanism to restart IRQ driver interrupt processing if
+ the application has stalled. This value should be chosen so that it covers
+ the amount of time the user application needs to process data from its
+ call to epoll_wait, noting that applications can control how much data
+ they retrieve by setting ``max_events`` when calling epoll_wait.
+
+ 2. The sysfs parameter or per-NAPI config parameters ``gro_flush_timeout``
+ and ``napi_defer_hard_irqs`` can be set to low values. They will be used
+ to defer IRQs after busy poll has found no data.
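
As an illustration of step 2 only, here is a minimal sketch of setting the two device-wide sysfs parameters from Python. The helper names (``napi_defer_settings``, ``apply_settings``), the ``sysfs_root`` argument, and the example values are hypothetical and not part of the patch; the per-NAPI netlink settings, including ``irq_suspend_timeout``, are not covered here.

```python
from pathlib import Path


def napi_defer_settings(dev, gro_flush_timeout_ns, defer_hard_irqs,
                        sysfs_root="/sys/class/net"):
    """Return (path, value) pairs for the device-wide sysfs parameters.

    gro_flush_timeout is expressed in nanoseconds; napi_defer_hard_irqs
    is a count. sysfs_root is overridable (hypothetical convenience) so
    the helper can be exercised against a scratch directory.
    """
    base = Path(sysfs_root) / dev
    return [
        (base / "gro_flush_timeout", str(gro_flush_timeout_ns)),
        (base / "napi_defer_hard_irqs", str(defer_hard_irqs)),
    ]


def apply_settings(settings):
    """Write each value to its file (requires root on a real system)."""
    for path, value in settings:
        path.write_text(value)
```

On a real system this might be called as ``apply_settings(napi_defer_settings("eth0", 20000, 1))``, where the device name and both values are placeholders to be tuned for the workload.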
Is it required to set gro_flush_timeout and napi_defer_hard_irqs when
irq_suspend_timeout is set? Doesn't it override any smaller
gro_flush_timeout value?
It is not required to use gro_flush_timeout or napi_defer_hard_irqs,
but if they are set they will take over when epoll finds no events.
Their usage is recommended. See the Usage section of the cover
letter for details.
While gro_flush_timeout and napi_defer_hard_irqs are not strictly
required, without them it is difficult for the polling-based packet
delivery loop to gain control over packet delivery.
Please see a previous email about this from the RFC for more
details:
https://lore.kernel.org/netdev/2bb121dd-3dcd-4142-ab87-02ccf4afd469@xxxxxxxxxxxx/
OK. Thanks for the clarification.
In the cover letter, you can note the difference in performance when
gro_flush_timeout is set to different values. Note the explanation
of suspendX; each suspend case is testing a different
gro_flush_timeout.
Maybe you can also include a test scenario in your perf results where
gro_flush_timeout and napi_defer_hard_irqs are not set to show that a
non-zero value of gro_flush_timeout and napi_defer_hard_irqs is
recommended when using irq_suspend_timeout.
Thanks for your feedback. We've updated the cover letter as well as the
kernel documentation to explain this in more detail and to illustrate
why the parameter usage is recommended. We ran experiments with these
parameters set to zero and the results are as expected and essentially
the same as the base case, i.e., irq_suspend_timeout does not have an
effect in this case.
Thanks,
Martin
Let us know if you have any other questions; both Martin and I are
happy to help or further explain anything that is not clear.