Hi,

+cc bpf list

On 5/6/2024 11:19 PM, Chase Hiltz wrote:
> Hi,
>
> I'm writing regarding a rather bizarre scenario that I'm hoping
> someone could provide insight on. I have a map defined as follows:
> ```
> struct {
>         __uint(type, BPF_MAP_TYPE_LRU_HASH);
>         __uint(max_entries, 1000000);
>         __type(key, struct my_map_key);
>         __type(value, struct my_map_val);
>         __uint(map_flags, BPF_F_NO_COMMON_LRU);
>         __uint(pinning, LIBBPF_PIN_BY_NAME);
> } my_map SEC(".maps");
> ```
> I have several fentry/fexit programs that need to perform updates in
> this map. After a certain number of map entries has been reached,
> calls to bpf_map_update_elem start returning `-ENOMEM`. As one
> example, I'm observing a program deployment where we have 816032
> entries on a 64 CPU machine, and a certain portion of updates are
> failing. I'm puzzled as to why this is occurring given that:
> - The 1M entries should be preallocated upon map creation (since I'm
> not using `BPF_F_NO_PREALLOC`)
> - The host machine has over 120G of unused memory available at any given time
>
> I've previously reduced max_entries by 25% under the assumption that
> this would prevent the problem from occurring, but this only caused

For an LRU map with BPF_F_NO_COMMON_LRU, max_entries is distributed
evenly between all CPUs. In your case, each CPU will have 1M/64 =
15625 entries. To reduce the possibility of the ENOMEM error, the
right way is to increase the value of max_entries instead of
decreasing it.

> map updates to start failing at a lower threshold. I believe that this
> is a problem with maps using the `BPF_F_NO_COMMON_LRU` flag, my
> reasoning being that when map updates fail, it occurs consistently for
> specific CPUs.

Does the specific CPU always fail afterwards, or does it fail
periodically? Is the machine running the bpf program an arm64 host or
an x86-64 host (namely uname -a)? I suspect that the problem may be
due to htab_lock_bucket(), which may fail on an arm64 host in v5.15.
Could you please check and count the ratio of times when
htab_lru_map_delete_node() returns 0? If the ratio is high, it
probably means that there are too many overwrites of entries between
different CPUs (e.g., CPU 0 updates key=X, then CPU 1 updates the same
key again).

> At this time, all machines experiencing the problem are running kernel
> version 5.15, however I'm not currently able to try out any newer
> kernels to confirm whether or not the same problem occurs there. Any
> ideas on what could be responsible for this would be greatly
> appreciated!
>
> Thanks,
> Chase Hiltz
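If it helps, one way to sample that ratio is a kretprobe on the
function. The following is an untested sketch only:
htab_lru_map_delete_node() is static in kernel/bpf/hashtab.c, so
attaching works only when the symbol survives inlining and is visible
in /proc/kallsyms, and the program/global names here are made up for
the example.

```c
/* Untested sketch: count how often htab_lru_map_delete_node() returns
 * nonzero vs. zero. Read the counts array from userspace (e.g. via a
 * libbpf skeleton) and compute the ratio. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

/* counts[0]: returned 0 (no node removed), counts[1]: returned nonzero */
__u64 counts[2];

SEC("kretprobe/htab_lru_map_delete_node")
int BPF_KRETPROBE(count_lru_delete_node, int ret)
{
	__sync_fetch_and_add(&counts[ret ? 1 : 0], 1);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```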