Re: [xdp-hints] Re: [Intel-wired-lan] [PATCH bpf-next V1] igc: enable and fix RX hash usage by netstack

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Jesper Dangaard Brouer <jbrouer@xxxxxxxxxx>
Date: Thu, 16 Feb 2023 16:17:46 +0100

> 
> On 14/02/2023 16.13, Alexander Lobakin wrote:
>> From: Paul Menzel <pmenzel@xxxxxxxxxxxxx>
>> Date: Tue, 14 Feb 2023 16:00:52 +0100
>>>
>>> Am 10.02.23 um 16:07 schrieb Jesper Dangaard Brouer:
>>>> When function igc_rx_hash() was introduced in v4.20 via commit
>>>> 0507ef8a0372
>>>> ("igc: Add transmit and receive fastpath and interrupt handlers"), the
>>>> hardware wasn't configured to provide RSS hash, thus it made sense
>>>> to not
>>>> enable net_device NETIF_F_RXHASH feature bit.
>>>>
> [...]
>>>
>>>> hash value doesn't include UDP port numbers. Not being
>>>> PKT_HASH_TYPE_L4, have
>>>> the effect that netstack will do a software based hash calc calling
>>>> into
>>>> flow_dissect, but only when code calls skb_get_hash(), which doesn't
>>>> necessary happen for local delivery.
>>>
>>> Excuse my ignorance, but is that bug visible in practice by users
>>> (performance?) or is that fix needed for future work?
>>
>> Hash calculation always happens when RPS or RFS is enabled. So having no
>> hash in skb before hitting the netstack slows down their performance.
>> Also, no hash in skb passed from the driver results in worse NAPI bucket
>> distribution when there are more traffic flows than Rx queues / CPUs.
>> + Netfilter needs hashes on some configurations.
>>
> 
> Thanks Olek for explaining that.

<O

> 
> My perf measurements show that the expensive part is that netstack will
> call the flow_dissector code, when the hardware RX-hash is missing.

Well, not always, but right, the skb_get_hash() family is used widely
across the netstack, so it's highly recommended to have hardware hash
filled in skbs, same as with checksums, to avoid wasting CPU on
computing them in software.
And the Flow Dissector is expensive by its nature, a bunch faster when
you attach a BPF prog to it, but still (not that I support P4, I don't
at all).

> 
>>>
>>>> Fixes: 2121c2712f82 ("igc: Add multiple receive queues control
>>>> supporting")
>>>> Signed-off-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
>>
>> [...]
>>
>> Nice to see that you also care about (not) using short types on the
>> stack :)
> 
> As can be seen by godbolt.org exploration[0] I have done, the stack
> isn't used for storing the values.
> 
>  [0]
> https://github.com/xdp-project/xdp-project/tree/master/areas/hints/godbolt/
> 
> I have created three files[2] with C-code that can be compiled via
> https://godbolt.org/. ; The C-code contains a comment with the ASM code
> that was generated with -02 with compiler x86-64 gcc 12.2.
> 
> The first file[01] corresponds to this patch.
> 
>  [01]
> https://github.com/xdp-project/xdp-project/blob/master/areas/hints/godbolt/igc_godbolt01.c
>  [G01] https://godbolt.org/z/j79M9aTsn
> 
> The second file igc_godbolt02.c [02] have changes in [diff02]
> 
>  [02]
> https://github.com/xdp-project/xdp-project/blob/master/areas/hints/godbolt/igc_godbolt02.c
>  [G02] https://godbolt.org/z/sErqe4qd5
>  [diff02] https://github.com/xdp-project/xdp-project/commit/1f3488a932767
> 
> The third file igc_godbolt03.c [03] have changes in [diff03]
> 
>  [03]
> https://github.com/xdp-project/xdp-project/blob/master/areas/hints/godbolt/igc_godbolt03.c
>  [G03] https://godbolt.org/z/5K3vE1Wsv
>  [diff03] https://github.com/xdp-project/xdp-project/commit/aa9298f68705
> 
> Summary, the only thing we can save is replacing some movzx
> (zero-extend) with mov instructions.

Good stuff, thanks! When I call to not use short types on the stack, the
only thing I care about is the resulting object code, not simple "just
don't use it, I said so". So when a developer inspects the results from
using one or another type, he's free in picking whatever he wants if it
doesn't hurt optimization.

[...]

Olek



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux