On 9/21/18 6:33 AM, Eric Dumazet wrote:
On 09/21/2018 12:17 AM, Song Liu wrote:
On Sep 20, 2018, at 4:49 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
On 09/20/2018 04:43 PM, Song Liu wrote:
I tried to totally skip ndo_poll_controller() here. It did avoid hitting
the issue. However, netpoll will drop (fail to send) more packets.
Why is it failing?
If you are under high memory pressure, then maybe if you absolutely want memory to send
netpoll packets, you want to grab all NAPI contexts as a way to prevent other cpus
from feeding incoming packets to the host and add more memory pressure ;)
I did the test with Eric's latest patch (and with ndo_poll_controller disabled
in the driver). The result didn't show a significant increase in dropped packets.
I guess the packet drops in my earlier test were caused by some other changes
I had mixed in.
So I think this patch does fix the issue. Thanks Eric!
Great, this is awesome.
I will prepare a patch series for net tree.
The core infrastructure is just better at being able to drain TX completions
without risking stealing the NAPI context forever.
Should we remove ndo_poll_controller then?
My understanding is that the patch helps by not letting
drivers do napi_schedule() for all queues on this_cpu, right?
But most of the drivers do exactly that in their ndo_poll_controller
implementations. That means most of the drivers will experience
this nasty behavior.
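
To make the concern concrete, here is a toy userspace model of that pattern.
It is not kernel code and is not taken from any driver: every toy_* name is
hypothetical, and "ownership" is reduced to a single CPU number per queue.
It only shows that one call to a poll controller written this way leaves the
calling CPU holding every NAPI context that was idle at the time.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_OWNER (-1)

/* Hypothetical stand-in for a NAPI context: tracks only the owning CPU. */
struct toy_napi {
	int owner_cpu;	/* TOY_NO_OWNER while idle */
};

/*
 * Toy version of scheduling a NAPI context: the calling CPU claims the
 * context if nobody owns it. Returns 1 if claimed, 0 if already owned.
 */
int toy_napi_schedule(struct toy_napi *napi, int this_cpu)
{
	assert(this_cpu >= 0);
	if (napi->owner_cpu != TOY_NO_OWNER)
		return 0;
	napi->owner_cpu = this_cpu;
	return 1;
}

/*
 * The pattern discussed above: the poll controller schedules every queue
 * from the CPU that netpoll happens to run on. Returns the number of
 * contexts this call claimed.
 */
size_t toy_poll_controller(struct toy_napi *queues, size_t nqueues,
			   int this_cpu)
{
	size_t claimed = 0;
	size_t i;

	for (i = 0; i < nqueues; i++)
		claimed += (size_t)toy_napi_schedule(&queues[i], this_cpu);
	return claimed;
}

/* Count how many contexts a given CPU currently owns. */
size_t toy_count_owned(const struct toy_napi *queues, size_t nqueues, int cpu)
{
	size_t owned = 0;
	size_t i;

	for (i = 0; i < nqueues; i++)
		if (queues[i].owner_cpu == cpu)
			owned++;
	return owned;
}
```

With four queues of which one is already being polled by another CPU, a
single toy_poll_controller() call from CPU 0 claims the other three, so all
of their RX/TX work lands on the CPU that was only trying to send a netpoll
packet.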