Re: [xdp-hints] Re: [PATCH RFC bpf-next 32/52] bpf, cpumap: switch to GRO from netif_receive_skb_list()

Alexander Lobakin <aleksander.lobakin@xxxxxxxxx> writes:

> From: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
> Date: Fri, 09 Aug 2024 14:45:33 +0200
>
>> Alexander Lobakin <aleksander.lobakin@xxxxxxxxx> writes:
>> 
>>> From: Daniel Xu <dxu@xxxxxxxxx>
>>> Date: Thu, 08 Aug 2024 16:52:51 -0400
>>>
>>>> Hi,
>>>>
>>>> On Thu, Aug 8, 2024, at 7:57 AM, Alexander Lobakin wrote:
>>>>> From: Lorenzo Bianconi <lorenzo.bianconi@xxxxxxxxxx>
>>>>> Date: Thu, 8 Aug 2024 06:54:06 +0200
>>>>>
>>>>>>> Hi Alexander,
>>>>>>>
>>>>>>> On Tue, Jun 28, 2022, at 12:47 PM, Alexander Lobakin wrote:
>>>>>>>> cpumap has its own BH context based on a kthread. It has a sane batch
>>>>>>>> size of 8 frames per cycle.
>>>>>>>> GRO can be used on its own, so adjust the cpumap calls to the upper
>>>>>>>> stack to use the GRO API instead of netif_receive_skb_list(), which
>>>>>>>> processes skbs in batches but doesn't involve the GRO layer at all.
>>>>>>>> It is most beneficial when the NIC the frames come from has XDP
>>>>>>>> generic metadata enabled, but in plenty of tests GRO performs better
>>>>>>>> than listified receiving even though it has to calculate full frame
>>>>>>>> checksums on the CPU.
>>>>>>>> As GRO passes the skbs to the upper stack in batches of
>>>>>>>> @gro_normal_batch, i.e. 8 by default, and @skb->dev points to the
>>>>>>>> device the frame comes from, it is enough to disable the GRO netdev
>>>>>>>> feature on it to completely restore the original behaviour: untouched
>>>>>>>> frames will be bulked and passed to the upper stack in groups of 8,
>>>>>>>> as with netif_receive_skb_list().
>>>>>>>>
>>>>>>>> Signed-off-by: Alexander Lobakin <alexandr.lobakin@xxxxxxxxx>
>>>>>>>> ---
>>>>>>>>  kernel/bpf/cpumap.c | 43 ++++++++++++++++++++++++++++++++++++++-----
>>>>>>>>  1 file changed, 38 insertions(+), 5 deletions(-)
>>>>>>>>
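A minimal sketch of what the switch described above boils down to, assuming a napi_struct embedded in struct bpf_cpu_map_entry that is used purely as a GRO context and never scheduled as a real NAPI instance; napi init/teardown and flushing of leftover sub-batch packets are omitted, so this is far smaller than the actual patch in the diffstat:

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Instead of handing the whole batch to netif_receive_skb_list(), feed
 * each skb through the GRO engine.  GRO then passes merged/untouched
 * skbs up the stack in batches of @gro_normal_batch (8 by default).
 */
static void cpu_map_gro_receive(struct napi_struct *napi,
				struct list_head *list)
{
	struct sk_buff *skb, *tmp;

	list_for_each_entry_safe(skb, tmp, list, list) {
		skb_list_del_init(skb);
		napi_gro_receive(napi, skb);
	}
}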
>>>>>>>
>>>>>>> AFAICT the cpumap + GRO is a good standalone improvement. I think
>>>>>>> cpumap is still missing this.
>>>>>
>>>>> The only concern with having GRO in cpumap without metadata from the NIC
>>>>> descriptor was that when the checksum status is missing, GRO calculates
>>>>> the checksum on the CPU, which is not really fast.
>>>>> But I remember that sometimes GRO was faster despite that.
>>>>
>>>> Good to know, thanks. IIUC some kind of XDP hint support landed already?
>>>>
>>>> My use case could also use HW RSS hash to avoid a rehash in XDP prog.
>>>
>>> Unfortunately, for now it's impossible to get HW metadata such as the RSS
>>> hash and checksum status in cpumap. They're implemented via kfuncs
>>> specific to a particular netdevice, and this info is available only while
>>> the XDP prog is running.
>>>
>>> But I think one solution could be:
>>>
>>> 1. We create some generic structure for cpumap, like
>>>
>>> struct cpumap_meta {
>>> 	u32 magic;
>>> 	u32 hash;
>>> };
>>>
>>> 2. We add a check like this in the cpumap code
>>>
>>> 	if (xdpf->metalen == sizeof(struct cpumap_meta) &&
>>> 	    <here we check magic>)
>>> 		skb->hash = meta->hash;
>>>
>>> 3. In the XDP prog, you call the Rx hints kfuncs when they're available,
>>> obtain the RSS hash and then put it in struct cpumap_meta as XDP frame
>>> metadata.
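A rough sketch of what the prog side of steps 1-3 could look like; CPUMAP_META_MAGIC, the map layout and the program name are made up for illustration, while bpf_xdp_metadata_rx_hash() and bpf_xdp_adjust_meta() are the existing kernel APIs (the kfunc may simply fail on NICs that don't implement it):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

struct cpumap_meta {
	__u32 magic;
	__u32 hash;
};

#define CPUMAP_META_MAGIC	0xc0ffee01	/* made-up value */

extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
				    enum xdp_rss_hash_type *rss_type) __ksym;

struct {
	__uint(type, BPF_MAP_TYPE_CPUMAP);
	__uint(max_entries, 8);
	__type(key, __u32);
	__type(value, struct bpf_cpumap_val);
} cpu_map SEC(".maps");

SEC("xdp")
int xdp_hash_to_cpumap(struct xdp_md *ctx)
{
	struct cpumap_meta *meta;
	enum xdp_rss_hash_type type;
	__u32 hash = 0;

	/* Read the RSS hash from the NIC descriptor while still in the
	 * driver's XDP context; not all drivers implement this kfunc.
	 */
	if (bpf_xdp_metadata_rx_hash(ctx, &hash, &type))
		hash = 0;

	/* Reserve metadata space in front of the frame and fill it. */
	if (bpf_xdp_adjust_meta(ctx, -(int)sizeof(*meta)))
		return XDP_PASS;

	meta = (void *)(long)ctx->data_meta;
	if ((void *)(meta + 1) > (void *)(long)ctx->data)
		return XDP_PASS;

	meta->magic = CPUMAP_META_MAGIC;
	meta->hash = hash;

	return bpf_redirect_map(&cpu_map, 0, 0);
}

char _license[] SEC("license") = "GPL";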
>> 
>> Yes, except don't make this cpumap-specific, make it generic for kernel
>> consumption of the metadata. That way it doesn't even have to be stored
>> in the xdp metadata area, it can be anywhere we want (and hence not
>> subject to ABI issues), and we can use it for skb creation after
>> redirect in other places than cpumap as well (say, on veth devices).
>> 
>> So it'll be:
>> 
>> struct kernel_meta {
>> 	u32 hash;
>> 	u32 timestamp;
>> 	...etc
>> };
>> 
>> and a kfunc:
>> 
>> void store_xdp_kernel_meta(struct kernel_meta *meta);
>> 
>> which the XDP program can call to populate the metadata area.
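A sketch of how an XDP program could use such a kfunc. struct kernel_meta and store_xdp_kernel_meta() are only the hypothetical names from this thread, not an existing kernel API, and a ctx argument is assumed here so the kernel knows which frame the metadata belongs to; bpf_xdp_metadata_rx_hash() and bpf_xdp_metadata_rx_timestamp() are the existing device kfuncs:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* Hypothetical generic metadata, as proposed above. */
struct kernel_meta {
	__u32 hash;
	__u64 timestamp;
};

/* Hypothetical kfunc from the proposal; does not exist in the kernel. */
extern void store_xdp_kernel_meta(struct xdp_md *ctx,
				  const struct kernel_meta *meta) __ksym;

/* Existing device kfuncs for reading HW hints. */
extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
				    enum xdp_rss_hash_type *rss_type) __ksym;
extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
					 __u64 *timestamp) __ksym;

SEC("xdp")
int xdp_store_kernel_meta(struct xdp_md *ctx)
{
	struct kernel_meta meta = {};
	enum xdp_rss_hash_type type;

	if (bpf_xdp_metadata_rx_hash(ctx, &meta.hash, &type))
		meta.hash = 0;
	if (bpf_xdp_metadata_rx_timestamp(ctx, &meta.timestamp))
		meta.timestamp = 0;

	/* Hand the hints to the kernel; where it stores them is an
	 * implementation detail, so the layout never becomes ABI.
	 */
	store_xdp_kernel_meta(ctx, &meta);

	/* ...then redirect to a cpumap or a veth as usual. */
	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";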
>
> Hmm, nice!
>
> But where would we store this info in the case of cpumap, if not in
> xdp->data_meta? When you convert XDP frames to skbs in the cpumap code,
> you only have &xdp_frame and that's it. The XDP prog was already run
> earlier, from the driver code, at that point.

Well, we could put it in skb_shared_info? IIRC, some of the metadata
(timestamps?) ends up there when building an skb anyway, so we won't even
have to copy it around...
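As a rough illustration of that: the HW timestamp already has a home in
skb_shared_info (skb_hwtstamps() points into it), so applying the generic
metadata while building the skb after a redirect could look something like
the sketch below; struct kernel_meta is still the hypothetical type from
above, and the L4 hash type is a simplification:

#include <linux/skbuff.h>
#include <linux/ktime.h>

/* Hypothetical generic metadata handed over by the XDP prog (see above). */
struct kernel_meta {
	u32 hash;
	u64 timestamp;
};

/* Apply the stored hints while converting an xdp_frame to an skb
 * (cpumap, veth, ...).
 */
static void xdp_kernel_meta_apply(struct sk_buff *skb,
				  const struct kernel_meta *meta)
{
	/* The real hash type would come from the device kfunc. */
	if (meta->hash)
		skb_set_hash(skb, meta->hash, PKT_HASH_TYPE_L4);

	/* skb_hwtstamps() resolves to &skb_shinfo(skb)->hwtstamps, i.e. the
	 * timestamp lands in skb_shared_info with no extra copy.
	 */
	if (meta->timestamp)
		skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(meta->timestamp);
}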

-Toke





