On Thu, 26 Aug 2021 at 20:40, Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> Are you doing three parallel test_run commands
> with repeat=1 and doing this syscall 1m times?
> yeah, that would stress bpf_dispatcher_update() logic nicely :)

Yup, exactly.

> 3m accesses to the same mutex and flip flop of a single page
> with tlb flush and text_poke_bp.
> Can your test harness use test_run with repeat = 1m instead?
> Or it's not possible, since input data is different every time?

The input changes for every run.

> I think avoiding xdp dispatcher for repeat=1 makes sense.
> Folks might be using this facility in similar fashion and
> paying the dispatcher penalty for a single run is unnecessary.
> While at it would be good to add the test_run specific xdp dispatcher.
> Since right now all netdevs share a single global xdp dispatcher.

I guess a side effect of this is that too many BPF_PROG_TEST_RUN
(aka BPF_PROG_RUN now) may slow down programs attaching to a NIC?

> 100 parallel xdp test_run threads will probably fail because
> they will reach BPF_DISPATCHER_MAX limit.

If there are too many concurrent progs the dispatcher just becomes a
no-op rather than return an error, I think.

Lorenz

-- 
Lorenz Bauer | Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com