On Thu, Dec 8, 2022 at 2:59 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> Stanislav Fomichev <sdf@xxxxxxxxxx> writes:
>
> > From: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
> >
> > Support RX hash and timestamp metadata kfuncs. We need to pass in the cqe
> > pointer to the mlx5e_skb_from* functions so it can be retrieved from the
> > XDP ctx to do this.
>
> So I finally managed to get enough ducks in a row to actually benchmark
> this. With the caveat that I suddenly can't get the timestamp support to
> work (it was working in an earlier version, but now
> timestamp_supported() just returns false). I'm not sure if this is an
> issue with the enablement patch, or if I just haven't gotten the
> hardware configured properly. I'll investigate some more, but figured
> I'd post these results now:
>
> Baseline XDP_DROP:         25,678,262 pps / 38.94 ns/pkt
> XDP_DROP + read metadata:  23,924,109 pps / 41.80 ns/pkt
> Overhead:                   1,754,153 pps /  2.86 ns/pkt
>
> As per the above, this is with calling three kfuncs/pkt
> (metadata_supported(), rx_hash_supported() and rx_hash()). So that's
> ~0.95 ns per function call, which is a bit less, but not far off from
> the ~1.2 ns that I'm used to. The tests where I accidentally called the
> default kfuncs cut off ~1.3 ns for one less kfunc call, so it's
> definitely in that ballpark.
>
> I'm not doing anything with the data, just reading it into an on-stack
> buffer, so this is the smallest possible delta from just getting the
> data out of the driver. I did confirm that the call instructions are
> still in the BPF program bytecode when it's dumped back out from the
> kernel.
>
> -Toke
>

Oh, that's great, thanks for running the numbers! Will definitely
reference them in v4!

Presumably, we should be able to at least unroll most of the _supported
callbacks if we want, they should be relatively easy; but the numbers
look fine as is?
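
For anyone who wants to sanity-check the per-call figure, the arithmetic
from the quoted results can be reproduced with a few lines of Python. This
is only a sketch: the variable and function names are made up for
illustration, and it assumes the overhead is spread evenly across the three
kfunc calls, which the measurement by itself does not establish.

```python
# Reproduce the ns/pkt and per-kfunc overhead figures from the pps numbers
# quoted above. All names here are illustrative, not from the patch series.
NS_PER_SEC = 1_000_000_000


def ns_per_pkt(pps: int) -> float:
    """Convert a packets-per-second rate into nanoseconds per packet."""
    return NS_PER_SEC / pps


baseline_pps = 25_678_262   # XDP_DROP only
metadata_pps = 23_924_109   # XDP_DROP + read metadata
kfunc_calls = 3             # metadata_supported(), rx_hash_supported(), rx_hash()

baseline_ns = ns_per_pkt(baseline_pps)
metadata_ns = ns_per_pkt(metadata_pps)
overhead_pps = baseline_pps - metadata_pps
overhead_ns = metadata_ns - baseline_ns

# Assumption: every kfunc call costs the same, so divide evenly.
per_call_ns = overhead_ns / kfunc_calls

print(f"baseline: {baseline_ns:.2f} ns/pkt")
print(f"metadata: {metadata_ns:.2f} ns/pkt")
print(f"overhead: {overhead_pps:,} pps / {overhead_ns:.2f} ns/pkt")
print(f"per call: {per_call_ns:.2f} ns")
```

Note that the overhead has to be computed as a difference of per-packet
times; converting the 1,754,153 pps difference to ns directly would give a
meaningless number.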