Hey everyone,

TL;DR: Are BPF map operations guaranteed to succeed if the map is configured correctly and accesses to the map do not interrupt each other? Can this be relied on in the future as well?

I am looking into migrating some cgroup statistics that we maintain internally from in-kernel code to BPF. I am considering several aspects of that, including reliability. With in-kernel code things are really simple: we add the data structures containing the stats to the cgroup controller struct, we update them as appropriate, and we export them when needed. With BPF, we need to hook progs to the right locations and store the stats in BPF maps (cgroup local storages, task local storages, hash tables, and, in the future, trees, etc.).

The question I am asking here is about the reliability of such map operations. Looking at the code for lookups and updates for some map types, I can see a lot of failure cases. Looking deeper into them, it *seems* to me that in an ideal scenario nothing should fail. By an ideal scenario I mean:

- The map size is set correctly,
- There is sufficient memory on the system,
- We don't use the BPF maps in any progs attached to the BPF map manipulation code itself,
- We don't use the BPF maps in any progs that can interrupt each other (e.g. NMI context).

IOW, there are no cases where we fail because two programs running in parallel are trying to access the same map (or map element), or because we couldn't acquire a resource that we don't want to wait on (where waiting wouldn't result in a deadlock), i.e. situations where we might prefer the caller to retry later or where we don't care about one missed operation.

Maybe all of this is obvious and I am being paranoid, or maybe there are other obvious failure cases that I missed, or maybe this is just a dumb question, so I apologize in advance if any of this is true :)

Thanks!