Question: BPF maps reliability

Yosry Ahmed <yosryahmed@xxxxxxxxxx> · Wed, 2 Nov 2022 11:47:31 -0700

Hey everyone,

TL;DR Are BPF map operations guaranteed to succeed if the map is
configured correctly and accesses to the map do not interrupt each
other? Can this be relied on in the future as well?

I am looking into migrating some cgroup statistics we internally
maintain to use BPF instead of in-kernel code. I am considering
several aspects of that, including reliability. With in-kernel code
things are really simple: we add the data structures containing the
stats to the cgroup controller struct, we update them as appropriate,
and we export them when needed. With BPF, we need to hook progs to the
right locations and store the stats in BPF maps (cgroup local
storages, task local storages, hash tables, and, in the future,
trees).

The question I am asking here is about the reliability of such map
operations. Looking at the code for lookups and updates for some map
types, I can see a lot of failure cases. Looking deeper into them it
*seems* to me like in an ideal scenario nothing should fail. By an
ideal scenario I mean:
- The map size is set correctly,
- There is sufficient memory on the system,
- We don't use the BPF maps in any progs attached to the BPF map
manipulation code itself,
- We don't use the BPF maps in any progs that can interrupt each other
(e.g. NMI context).

IOW, there are no cases where we fail because two programs running in
parallel are trying to access the same map (or map element) or because
we couldn't acquire a resource that we don't want to wait on (that
wouldn't result in a deadlock), that is, situations where we might
prefer the caller to retry later or where we don't care about one
missed operation.
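
To make the question a bit more concrete, here is a rough userspace
model of the kind of failure cases I mean. It is not kernel code: the
names (model_map, model_map_add, MODEL_MAX_ENTRIES) are made up for
illustration, and the error codes are only my assumption of how a
full map (-E2BIG) and a re-entrant access (-EBUSY) would be reported.
Memory allocation failure is not modeled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a fixed-size stats map, NOT kernel code. */
#define MODEL_MAX_ENTRIES 2

struct model_entry {
	uint64_t key;
	uint64_t value;
	bool used;
};

struct model_map {
	struct model_entry slots[MODEL_MAX_ENTRIES];
	bool busy; /* set while an update is in flight */
};

/*
 * Add delta to the counter stored under key, creating the entry if needed.
 * Returns 0 on success, -EBUSY on re-entrant access, -E2BIG if the map is
 * full. The error codes are assumptions made for this sketch.
 */
static int model_map_add(struct model_map *m, uint64_t key, uint64_t delta)
{
	struct model_entry *free_slot = NULL;

	assert(m != NULL);

	/* A nested caller (think: a prog interrupting another) is refused. */
	if (m->busy)
		return -EBUSY;
	m->busy = true;

	for (int i = 0; i < MODEL_MAX_ENTRIES; i++) {
		struct model_entry *e = &m->slots[i];

		if (e->used && e->key == key) {
			e->value += delta;
			m->busy = false;
			return 0;
		}
		if (!e->used && !free_slot)
			free_slot = e; /* remember the first free slot */
	}

	/* Key not present and no free slot: the map was sized too small. */
	if (!free_slot) {
		m->busy = false;
		return -E2BIG;
	}

	free_slot->key = key;
	free_slot->value = delta;
	free_slot->used = true;
	m->busy = false;
	return 0;
}
```

In this model, the "ideal scenario" above means that neither error
path is ever taken: the map is large enough for every key, and nothing
calls model_map_add() while another call on the same map is in flight.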

Maybe all of this is obvious and I am being paranoid, or maybe there
are other obvious failure cases that I missed, or maybe this is just a
dumb question, so I apologize in advance if any of this is true :)

Thanks!