Re: [xdp-hints] Re: [PATCH RFC bpf-next 32/52] bpf, cpumap: switch to GRO from netif_receive_skb_list()

On Aug 09, Toke wrote:
> Alexander Lobakin <aleksander.lobakin@xxxxxxxxx> writes:
> 
> > From: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
> > Date: Fri, 09 Aug 2024 14:45:33 +0200
> >
> >> Alexander Lobakin <aleksander.lobakin@xxxxxxxxx> writes:
> >> 
> >>> From: Daniel Xu <dxu@xxxxxxxxx>
> >>> Date: Thu, 08 Aug 2024 16:52:51 -0400
> >>>
> >>>> Hi,
> >>>>
> >>>> On Thu, Aug 8, 2024, at 7:57 AM, Alexander Lobakin wrote:
> >>>>> From: Lorenzo Bianconi <lorenzo.bianconi@xxxxxxxxxx>
> >>>>> Date: Thu, 8 Aug 2024 06:54:06 +0200
> >>>>>
> >>>>>>> Hi Alexander,
> >>>>>>>
> >>>>>>> On Tue, Jun 28, 2022, at 12:47 PM, Alexander Lobakin wrote:
> >>>>>>>> cpumap has its own BH context based on a kthread. It has a sane
> >>>>>>>> batch size of 8 frames per cycle.
> >>>>>>>> GRO can be used in such a context on its own, so adjust the cpumap
> >>>>>>>> calls to the upper stack to use the GRO API instead of
> >>>>>>>> netif_receive_skb_list(), which processes skbs in batches but
> >>>>>>>> doesn't involve the GRO layer at all.
> >>>>>>>> It is most beneficial when the NIC the frames come from has XDP
> >>>>>>>> generic metadata enabled, but in plenty of tests GRO performs
> >>>>>>>> better than listified receiving even though it has to calculate
> >>>>>>>> full frame checksums on the CPU.
> >>>>>>>> As GRO passes the skbs to the upper stack in batches of
> >>>>>>>> @gro_normal_batch, i.e. 8 by default, and @skb->dev points to the
> >>>>>>>> device the frame comes from, it is enough to disable the GRO netdev
> >>>>>>>> feature on that device to fully restore the original behaviour:
> >>>>>>>> untouched frames will still be bulked and passed to the upper stack
> >>>>>>>> by 8, as they were with netif_receive_skb_list().
> >>>>>>>>
> >>>>>>>> Signed-off-by: Alexander Lobakin <alexandr.lobakin@xxxxxxxxx>
> >>>>>>>> ---
> >>>>>>>>  kernel/bpf/cpumap.c | 43 ++++++++++++++++++++++++++++++++++++++-----
> >>>>>>>>  1 file changed, 38 insertions(+), 5 deletions(-)
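> >>>>>>>> The gist of the switch is roughly the following (just a sketch,
> >>>>>>>> not the actual diff; "napi" here stands in for whatever GRO
> >>>>>>>> context the cpumap kthread ends up owning):
> >>>>>>>>
> >>>>>>>> 	struct sk_buff *skb, *tmp;
> >>>>>>>>
> >>>>>>>> 	/* before: hand the whole batch to the stack at once,
> >>>>>>>> 	 * bypassing GRO entirely
> >>>>>>>> 	 */
> >>>>>>>> 	netif_receive_skb_list(&list);
> >>>>>>>>
> >>>>>>>> 	/* after: feed skbs into GRO one by one; GRO then flushes
> >>>>>>>> 	 * them to the stack in batches of @gro_normal_batch
> >>>>>>>> 	 */
> >>>>>>>> 	list_for_each_entry_safe(skb, tmp, &list, list) {
> >>>>>>>> 		skb_list_del_init(skb);
> >>>>>>>> 		napi_gro_receive(napi, skb);
> >>>>>>>> 	}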
> >>>>>>>>
> >>>>>>>
> >>>>>>> AFAICT the cpumap + GRO is a good standalone improvement. I think
> >>>>>>> cpumap is still missing this.
> >>>>>
> >>>>> The only concern about having GRO in cpumap without metadata from
> >>>>> the NIC descriptor was that, when the checksum status is missing,
> >>>>> GRO calculates the checksum on the CPU, which is not really fast.
> >>>>> But I remember GRO was sometimes faster despite that.
> >>>>
> >>>> Good to know, thanks. IIUC some kind of XDP hint support landed already?
> >>>>
> >>>> My use case could also use HW RSS hash to avoid a rehash in XDP prog.
> >>>
> >>> Unfortunately, for now it's impossible to get HW metadata such as the
> >>> RSS hash and checksum status in cpumap. They're implemented via kfuncs
> >>> specific to a particular netdevice, and this info is available only
> >>> while the XDP prog is running.
> >>>
> >>> But I think one solution could be:
> >>>
> >>> 1. We create some generic structure for cpumap, like
> >>>
> >>> struct cpumap_meta {
> >>> 	u32 magic;
> >>> 	u32 hash;
> >>> };
> >>>
> >>> 2. We add a check like this in the cpumap code (see the sketch below)
> >>>
> >>> 	if (xdpf->metasize == sizeof(struct cpumap_meta) &&
> >>> 	    <here we check magic>)
> >>> 		skb->hash = meta->hash;
> >>>
> >>> 3. In the XDP prog, you call the Rx hints kfuncs when they're
> >>> available, obtain the RSS hash and then store it in struct cpumap_meta
> >>> as XDP frame metadata.
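> >>>
> >>> Expanding 2) a bit, a rough sketch of the check (the magic value and
> >>> the hash type are just placeholders; the metadata is assumed to sit
> >>> right in front of xdpf->data as usual):
> >>>
> >>> #define CPUMAP_META_MAGIC	0xc0ffee42	/* placeholder */
> >>>
> >>> 	/* in the cpumap skb-build path, next to
> >>> 	 * __xdp_build_skb_from_frame()
> >>> 	 */
> >>> 	if (xdpf->metasize == sizeof(struct cpumap_meta)) {
> >>> 		const struct cpumap_meta *meta = xdpf->data -
> >>> 						 xdpf->metasize;
> >>>
> >>> 		if (meta->magic == CPUMAP_META_MAGIC)
> >>> 			skb_set_hash(skb, meta->hash,
> >>> 				     PKT_HASH_TYPE_L4);
> >>> 	}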
> >> 
> >> Yes, except don't make this cpumap-specific, make it generic for kernel
> >> consumption of the metadata. That way it doesn't even have to be stored
> >> in the xdp metadata area, it can be anywhere we want (and hence not
> >> subject to ABI issues), and we can use it for skb creation after
> >> redirect in other places than cpumap as well (say, on veth devices).
> >> 
> >> So it'll be:
> >> 
> >> struct kernel_meta {
> >> 	u32 hash;
> >> 	u32 timestamp;
> >> 	...etc
> >> };
> >> 
> >> and a kfunc:
> >> 
> >> void store_xdp_kernel_meta(struct kernel_meta *meta);
> >> 
> >> which the XDP program can call to populate the metadata area.
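> >>
> >> From the prog side it could then look roughly like below.
> >> bpf_xdp_metadata_rx_hash() is the existing Rx hints kfunc;
> >> store_xdp_kernel_meta() is only the proposal above and doesn't exist
> >> in the tree:
> >>
> >> #include <vmlinux.h>
> >> #include <bpf/bpf_helpers.h>
> >>
> >> /* as proposed above */
> >> struct kernel_meta {
> >> 	__u32 hash;
> >> 	__u32 timestamp;
> >> };
> >>
> >> extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx,
> >> 				    __u32 *hash,
> >> 				    enum xdp_rss_hash_type *rss_type) __ksym;
> >> /* proposed kfunc, does not exist yet */
> >> extern void store_xdp_kernel_meta(struct kernel_meta *meta) __ksym;
> >>
> >> struct {
> >> 	__uint(type, BPF_MAP_TYPE_CPUMAP);
> >> 	__uint(max_entries, 16);
> >> 	__type(key, __u32);
> >> 	__type(value, struct bpf_cpumap_val);
> >> } cpu_map SEC(".maps");
> >>
> >> SEC("xdp")
> >> int redirect_with_meta(struct xdp_md *ctx)
> >> {
> >> 	enum xdp_rss_hash_type type;
> >> 	struct kernel_meta meta = {};
> >>
> >> 	/* grab the HW RSS hash if the driver implements the kfunc */
> >> 	if (!bpf_xdp_metadata_rx_hash(ctx, &meta.hash, &type))
> >> 		store_xdp_kernel_meta(&meta);
> >>
> >> 	return bpf_redirect_map(&cpu_map, 0, 0);
> >> }
> >>
> >> char LICENSE[] SEC("license") = "GPL";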
> >
> > Hmm, nice!
> >
> > But where would we store this info in the case of cpumap, if not in
> > xdp->data_meta? When you convert XDP frames to skbs in the cpumap code,
> > you only have the &xdp_frame and that's it; the XDP prog has already
> > run earlier, from the driver code, at that point.
> 
> Well, we could put it in skb_shared_info? IIRC, some of the metadata
> (timestamps?) ends up there when building an skb anyway, so we won't even
> have to copy it around...
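> 
> Very roughly, something like this on the consuming side (assuming
> skb_shared_info grew a kernel_meta field, which it doesn't have today;
> hwtstamps already lives there):
> 
> 	/* in cpu_map_kthread_run(), after the skb is built: the meta
> 	 * already sits in skb_shared_info, only the skb fields that
> 	 * can't live there need to be filled in
> 	 */
> 	const struct kernel_meta *meta = &skb_shinfo(skb)->kernel_meta;
> 
> 	skb_set_hash(skb, meta->hash, PKT_HASH_TYPE_L4);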

Before my vacation I started looking into this a bit; I will resume the work
in a week or so.

Regards,
Lorenzo

> 
> -Toke
> 


