On 9/8/23 04:42, Paul E. McKenney wrote:
> But what BPF programs are you running that are seeing excessive
> synchronization overhead? That will tell us which operations to start
> with. (Or maybe it is time to just add the full Linux-kernel
> atomic-operations kitchen sink, but that would not normally be the way
> to bet.)
Here's what I use in BPF (and also for writing parallel schedulers):
- READ_ONCE/WRITE_ONCE
- compiler atomic builtins, like CAS, swap/exchange, fetch_and_add, etc.
- smp_store_release, __atomic_load_n, etc.
- at one point, I was sprinkling asm volatile ("" ::: "memory") around
  too, though not in any active code at the moment.
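For concreteness, here is a rough sketch of the primitives listed above, written as ordinary userspace C for GCC or Clang. The macro bodies are illustrative stand-ins, not the kernel's or libbpf's definitions, and the helper names (bump, swap_in, try_claim, publish, consume, demo) and the shared variables are hypothetical. Whether each builtin is accepted by the BPF backend and handled correctly by the JIT is a separate question that this sketch does not answer.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins only; NOT the kernel's or libbpf's definitions. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define barrier()        __asm__ __volatile__("" ::: "memory")
#define smp_store_release(p, v) __atomic_store_n((p), (v), __ATOMIC_RELEASE)
#define smp_load_acquire(p)     __atomic_load_n((p), __ATOMIC_ACQUIRE)

/* Hypothetical shared state for the example. */
static uint64_t counter;
static uint64_t payload;
static int ready;

/* fetch_and_add: returns the value held before the addition. */
static uint64_t bump(uint64_t n)
{
	return __atomic_fetch_add(&counter, n, __ATOMIC_RELAXED);
}

/* swap/exchange: stores v and returns the previous value. */
static uint64_t swap_in(uint64_t *slot, uint64_t v)
{
	return __atomic_exchange_n(slot, v, __ATOMIC_ACQ_REL);
}

/* CAS: nonzero only if *slot equaled expected and was replaced by desired. */
static int try_claim(uint64_t *slot, uint64_t expected, uint64_t desired)
{
	return __atomic_compare_exchange_n(slot, &expected, desired, 0,
					   __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
}

/* Writer: fill in the payload, then release-store the flag. */
static void publish(uint64_t v)
{
	WRITE_ONCE(payload, v);
	smp_store_release(&ready, 1);
}

/* Reader: acquire-load the flag, and read the payload only if it is set. */
static int consume(uint64_t *out)
{
	if (!smp_load_acquire(&ready))
		return 0;
	*out = READ_ONCE(payload);
	return 1;
}

/* Single-threaded walk through each helper. */
static void demo(void)
{
	uint64_t slot = 0, out = 0;

	assert(bump(5) == 0);
	assert(bump(2) == 5);
	assert(try_claim(&slot, 0, 42));
	assert(!try_claim(&slot, 0, 7));	/* slot is 42 now, so this fails */
	assert(swap_in(&slot, 1) == 42);
	barrier();				/* compiler-only barrier */
	assert(!consume(&out));			/* nothing published yet */
	publish(99);
	assert(consume(&out) && out == 99);
}
```

The release store in publish() pairs with the acquire load in consume(), so a reader that sees ready set also sees the payload written before it. demo() runs on one thread and only checks return values; it does not exercise any cross-CPU ordering.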
My mental model, right or wrong, is that I am operating under something
like the LKMM, and that I need to convince the compiler to spit out the
right code (sort of like writing shared memory code to talk to a device
or userspace) and hope the JIT does the right thing.
Barret