On Wed, 6 Sep 2023 11:11:48 +0800 Hayes Wang wrote:
> Stop submitting rx, if the driver queue more than 256 packets.
>
> If the hardware is more fast than the software, the driver would start
> queuing the packets. And, the driver starts dropping the packets, if it
> queues more than 1000 packets.
>
> Increase the weight of NAPI could improve the situation. However, the
> weight has been changed to 64, so we have to stop submitting rx when the
> driver queues too many packets. Then, the device may send the pause frame
> to slow down the receiving, when the FIFO of the device is full.

Good to see that you can repro the problem.

Before we tweak the heuristics let's make sure rx_bottom() behaves
correctly. Could you make sure that
 - we don't perform _any_ rx processing when budget is 0
   (see the NAPI documentation under Documentation/networking)
 - we finish the current aggregate even if the budget runs out, and
   return work_done = budget in that case.
   With this change the rx_queue thing should be gone completely.
 - instead of copying the head we use napi_get_frags() + napi_gro_frags();
   it gives you an skb, you just attach the page to it as a frag and
   hand it back to GRO. This makes sure only headers get pulled into
   the head, never payload data.

Please share the performance results with those changes.
-- 
pw-bot: cr
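
To illustrate the first two points, here is a rough userspace model of the budget handling. It is only a sketch, not r8152 code: `struct rx_agg` and `rx_poll_model()` are hypothetical names made up for this example, and each aggregate is reduced to a packet count. The napi_get_frags() / napi_gro_frags() part is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one rx aggregate (one buffer holding several packets). */
struct rx_agg {
	int npkts;	/* number of packets packed into this aggregate */
};

/*
 * Model of the poll loop: returns the work_done value to report and stores
 * the number of fully processed aggregates in *consumed.
 */
int rx_poll_model(const struct rx_agg *aggs, size_t n_aggs, int budget,
		  size_t *consumed)
{
	int work_done = 0;
	size_t i = 0;

	assert(budget >= 0);
	*consumed = 0;

	/* budget == 0: do no rx processing at all */
	if (budget == 0)
		return 0;

	/*
	 * Only start a new aggregate while budget is left, but always finish
	 * the one we started, even if that overshoots the budget. No packets
	 * are left over, so no private queue is needed.
	 */
	while (i < n_aggs && work_done < budget) {
		work_done += aggs[i].npkts;
		i++;
	}
	*consumed = i;

	/* Never report more than budget; report exactly budget on overshoot. */
	return work_done > budget ? budget : work_done;
}
```

With three aggregates of 40 packets each and a budget of 64, the model processes two whole aggregates (80 packets) and reports 64, so the poll routine gets called again for the third.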