> -----Original Message-----
> From: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Sent: Friday, October 2, 2020 10:53 PM
> To: John Fastabend <john.fastabend@xxxxxxxxx>; Lorenzo Bianconi <lorenzo@xxxxxxxxxx>; bpf@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx
> Cc: davem@xxxxxxxxxxxxx; kuba@xxxxxxxxxx; ast@xxxxxxxxxx; Agroskin, Shay <shayagr@xxxxxxxxxx>; Jubran, Samih <sameehj@xxxxxxxxxx>; dsahern@xxxxxxxxxx; brouer@xxxxxxxxxx; lorenzo.bianconi@xxxxxxxxxx; echaudro@xxxxxxxxxx
> Subject: RE: [EXTERNAL] [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
>
> On 10/2/20 5:25 PM, John Fastabend wrote:
> > Lorenzo Bianconi wrote:
> >> This series introduces XDP multi-buffer support. The mvneta driver is
> >> the first to support these new "non-linear" xdp_{buff,frame}.
> >> Reviewers, please focus on how these new types of xdp_{buff,frame}
> >> packets traverse the different layers, and on the layout design. The
> >> BPF-helpers are intentionally kept simple, as we don't want to expose
> >> the internal layout, to allow later changes.
> >>
> >> For now, to keep the design simple and to maintain performance, the
> >> XDP BPF-prog (still) only has access to the first buffer. It is left
> >> for later (another patchset) to add payload access across multiple buffers.
> >> This patchset should still allow for these future extensions. The
> >> goal is to lift the MTU restriction that comes with XDP, while
> >> maintaining the same performance as before.
> >>
> >> The main idea for the new multi-buffer layout is to reuse the same
> >> layout used for non-linear SKBs. This relies on the "skb_shared_info"
> >> struct at the end of the first buffer to link together subsequent
> >> buffers. Keeping the layout compatible with SKBs is also done to ease
> >> and speed up creating an SKB from an xdp_{buff,frame}. Converting an
> >> xdp_frame to an SKB and delivering it to the network stack is shown in
> >> the cpumap code (patch 13/13).
> >
> > Using the end of the buffer for the skb_shared_info struct is going to
> > become driver API, so unwinding it if it proves to be a performance
> > issue is going to be ugly. So, same question as before: for the use
> > case where we receive a packet and do XDP_TX with it, how do we avoid
> > the cache-miss overhead? This is not just a hypothetical use case; the
> > Facebook load balancer does this, as does Cilium, and allowing
> > this with multi-buffer packets >1500B would be useful.
> [...]
>
> Fully agree. My other question would be whether someone else is currently
> in the process of implementing this scheme for a 40G+ NIC? My concern is
> that the numbers below are rather on the lower end of the spectrum, so I
> would like to see a comparison of XDP as-is today vs XDP multi-buff on a
> higher-end NIC, so that we have a picture of how well the currently
> designed scheme works there and which performance issues we'll run into,
> e.g. under a typical XDP L4 load balancer scenario with XDP_TX. I think
> this would be crucial before the driver API becomes 'sort of' set in
> stone, where others start adapting to it and changing the design becomes
> painful. Do ena folks have an implementation ready as well? And what about
> virtio_net, for example, anyone committing there too? Typically, for such
> features to land, the requirement is at least 2 drivers implementing it.
We (ENA) expect to have an XDP MB implementation with performance results in around 4-6 weeks.

> >> Typical use cases for this series are:
> >> - Jumbo-frames
> >> - Packet header split (please see Google's use-case @ NetDevConf 0x14, [0])
> >> - TSO
> >>
> >> More info about the main idea behind this approach can be found here [1][2].
> >>
> >> We carried out some throughput tests in a standard linear-frame
> >> scenario in order to verify we did not introduce any performance
> >> regression when adding xdp multi-buff support to mvneta:
> >>
> >> offered load is ~1000Kpps, packet size is 64B, mvneta descriptor
> >> size is one PAGE
> >>
> >> commit: 879456bedbe5 ("net: mvneta: avoid possible cache misses in mvneta_rx_swbm")
> >> - xdp-pass: ~162Kpps
> >> - xdp-drop: ~701Kpps
> >> - xdp-tx: ~185Kpps
> >> - xdp-redirect: ~202Kpps
> >>
> >> mvneta xdp multi-buff:
> >> - xdp-pass: ~163Kpps
> >> - xdp-drop: ~739Kpps
> >> - xdp-tx: ~182Kpps
> >> - xdp-redirect: ~202Kpps
> [...]
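
For anyone less familiar with the layout being discussed above, here is a minimal user-space sketch of the idea: the skb_shared_info that links the extra fragments sits at the tail of the first buffer, at data_hard_start + frame_sz - sizeof(shared_info), so walking all fragments (e.g. to get the total frame length for an XDP_TX decision) means touching that tail area, which is the cache-miss concern raised above. The names below (xdp_buff_model, buff_shared_info, frag, shared_info) are simplified stand-ins, not the kernel definitions, and the sizes are illustrative rather than taken from the patchset.

/*
 * Illustrative user-space model only: struct names mirror (but are not)
 * the kernel's xdp_buff / skb_shared_info / skb_frag_t.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define FRAME_SZ  4096   /* one page per descriptor, as in the mvneta tests */
#define MAX_FRAGS 17     /* illustrative bound, similar to MAX_SKB_FRAGS */

struct frag {            /* stand-in for skb_frag_t */
	void *page;
	unsigned int len;
};

struct shared_info {     /* stand-in for struct skb_shared_info */
	uint8_t nr_frags;
	struct frag frags[MAX_FRAGS];
};

struct xdp_buff_model {  /* stand-in for struct xdp_buff */
	void *data_hard_start;
	void *data;
	void *data_end;
	unsigned int frame_sz;
};

/* The shared_info lives at the tail of the first buffer. */
static struct shared_info *buff_shared_info(const struct xdp_buff_model *xdp)
{
	return (struct shared_info *)((uint8_t *)xdp->data_hard_start +
				      xdp->frame_sz -
				      sizeof(struct shared_info));
}

int main(void)
{
	struct xdp_buff_model xdp;
	struct shared_info *sinfo;
	unsigned int total, i;

	xdp.data_hard_start = malloc(FRAME_SZ);
	xdp.frame_sz = FRAME_SZ;
	xdp.data = (uint8_t *)xdp.data_hard_start + 256;  /* headroom */
	xdp.data_end = (uint8_t *)xdp.data + 1500;        /* linear part */

	/* Two extra buffers hang off the tail shared_info. */
	sinfo = buff_shared_info(&xdp);
	sinfo->nr_frags = 2;
	sinfo->frags[0].page = NULL;
	sinfo->frags[0].len = 2048;
	sinfo->frags[1].page = NULL;
	sinfo->frags[1].len = 512;

	/* Total frame length = linear part + all fragment lengths. */
	total = (unsigned int)((uint8_t *)xdp.data_end - (uint8_t *)xdp.data);
	for (i = 0; i < sinfo->nr_frags; i++)
		total += sinfo->frags[i].len;

	printf("linear + %u frags, total %u bytes\n",
	       (unsigned int)sinfo->nr_frags, total);
	free(xdp.data_hard_start);
	return 0;
}

The point of keeping this layout identical to the non-linear SKB one, as the cover letter notes, is that building an SKB from an xdp_{buff,frame} then needs no copying of the fragment descriptors.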