On Sat, 17 Aug 2019 at 00:08, Jonathan Lemon <jonathan.lemon@xxxxxxxxx> wrote:
> On 16 Aug 2019, at 6:32, Björn Töpel wrote:
[...]
> >
> > Today, from a driver perspective, to enable XDP you pass a struct
> > bpf_prog pointer via the ndo_bpf. The program gets executed in
> > BPF_PROG_RUN (via bpf_prog_run_xdp) from include/linux/filter.h.
> >
> > I think it's possible to achieve what you're doing w/o *any* driver
> > modification. Pass a special, invalid pointer to the driver (say
> > (void *)0x1 or something more elegant), which gets special handling in
> > BPF_PROG_RUN, e.g. setting a per-cpu state and returning XDP_REDIRECT.
> > The per-cpu state is picked up in xdp_do_redirect and xdp_flush.
> >
> > An approach like this would be general, and apply to all modes
> > automatically.
> >
> > Thoughts?
>
> All the default program does is check that the map entry contains an xsk,
> and call bpf_redirect_map(). So this is pretty much the same as above,
> without any special case handling.
>
> Why would this be so expensive? Is the JIT compilation time being
> counted?

No, not the JIT compilation time, only the fast-path. The gain is from
removing the indirect call (hitting a retpoline) when calling the XDP
program, and from reducing the code in xdp_do_redirect/xdp_flush.

But, as Jakub pointed out, the XDP batching work by Maciej might
reduce the retpoline impact quite a bit.


Björn