On 2021-05-09 1:37 a.m., Cong Wang wrote:
On Tue, Apr 27, 2021 at 11:34 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
[..]
I am pretty sure I showed the original report to you when I sent the timeout hashmap patch. In case you forgot, here it is again:
https://github.com/cilium/cilium/issues/5048

Let me quote the original report here:

"The current implementation (as of v1.2) for managing the contents of the datapath connection tracking map leaves something to be desired: Once per minute, the userspace cilium-agent makes a series of calls to the bpf() syscall to fetch all of the entries in the map to determine whether they should be deleted. For each entry in the map, 2-3 calls must be made: One to fetch the next key, one to fetch the value, and perhaps one to delete the entry. The maximum size of the map is 1 million entries, and if the current count approaches this size then the garbage collection goroutine may spend a significant number of CPU cycles iterating and deleting elements from the conntrack map."
That Cilium issue was a good read on the general issues. Our use case involves anywhere between 4M and 16M cached entries.

Like I mentioned earlier: periodically, if some condition is met in the kernel on a map entry, we want to clean up or update the entry, or send unsolicited housekeeping events to user space. Polling to achieve this for that many entries is expensive.

I would argue, again, that timers are generally useful for a variety of housekeeping purposes, and they are currently missing from eBPF. Again, this is regardless of Cong's use case. Currently, things in the eBPF datapath are triggered either by packets showing up or, from a control plane perspective, by user space polling. We need the timers for completeness.

cheers,
jamal
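
To put rough numbers on the polling cost discussed above, here is a minimal sketch that models the userspace sweep from the quoted report. It is only a simulation: FakeMap, sweep() and syscall_bounds() are hypothetical names, a Python dict stands in for the BPF map, and each helper method counts as one simulated bpf() call. It is not the Cilium implementation.

```python
class FakeMap:
    """Hypothetical stand-in for a BPF hash map; counts simulated bpf() calls."""

    def __init__(self, entries):
        self.entries = dict(entries)  # key -> expiry timestamp
        self.syscalls = 0

    def get_next_key(self, prev=None):
        # Models one "fetch the next key" call.
        self.syscalls += 1
        keys = list(self.entries)
        if prev is None:
            return keys[0] if keys else None
        idx = keys.index(prev) + 1
        return keys[idx] if idx < len(keys) else None

    def lookup(self, key):
        # Models one "fetch the value" call.
        self.syscalls += 1
        return self.entries[key]

    def delete(self, key):
        # Models one "delete the entry" call.
        self.syscalls += 1
        del self.entries[key]


def sweep(m, now):
    """Walk every entry, deleting expired ones; return total simulated calls."""
    key = m.get_next_key()
    while key is not None:
        nxt = m.get_next_key(key)  # fetch the successor before deleting key
        if m.lookup(key) <= now:
            m.delete(key)
        key = nxt
    return m.syscalls


def syscall_bounds(n_entries):
    """Lower and upper bound on calls per sweep at 2-3 calls per entry."""
    return 2 * n_entries, 3 * n_entries


demo = FakeMap({k: k for k in range(1000)})  # entry k expires at time k
calls = sweep(demo, now=499)                 # entries 0..499 are expired
print("simulated calls for 1000 entries:", calls)

for n in (1_000_000, 4_000_000, 16_000_000):
    low, high = syscall_bounds(n)
    print(f"{n} entries: {low} to {high} calls per sweep")
```

At 2-3 calls per entry, a full 1M-entry map costs 2M to 3M calls per sweep, and a 16M-entry cache costs 32M to 48M calls per sweep, before any work is done on the entries themselves.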