Alexander Lobakin <aleksander.lobakin@xxxxxxxxx> writes:

> cpumap has its own BH context based on kthread. It has a sane batch
> size of 8 frames per one cycle.
> GRO can be used here on its own. Adjust cpumap calls to the upper stack
> to use GRO API instead of netif_receive_skb_list() which processes skbs
> by batches, but doesn't involve GRO layer at all.
> In plenty of tests, GRO performs better than listed receiving even
> given that it has to calculate full frame checksums on the CPU.
> As GRO passes the skbs to the upper stack in the batches of
> @gro_normal_batch, i.e. 8 by default, and skb->dev points to the
> device where the frame comes from, it is enough to disable GRO
> netdev feature on it to completely restore the original behaviour:
> untouched frames will be being bulked and passed to the upper stack
> by 8, as it was with netif_receive_skb_list().
>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@xxxxxxxxx>
> Tested-by: Daniel Xu <dxu@xxxxxxxxx>

Reviewed-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>