On 3/26/24 1:12 AM, Yafang Shao wrote:
Following the recent upgrade of one of our BPF programs, we encountered significant latency spikes affecting other applications running on the same host. After thorough investigation, we identified that these spikes were primarily caused by the prolonged duration required to free a non-preallocated htab with approximately 2 million keys.

Notably, our kernel configuration lacks the presence of CONFIG_PREEMPT. In scenarios where kernel execution extends excessively, other threads might be starved of CPU time, resulting in latency issues across the system. To mitigate this, we've adopted a proactive approach by incorporating cond_resched() calls within the kernel code. This ensures that during lengthy kernel operations, the scheduler is invoked periodically to provide opportunities for other threads to execute.

Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
---
 kernel/bpf/hashtab.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 3a088a5349bc..d3d5aad045cc 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -1489,6 +1489,7 @@ static void delete_all_elements(struct bpf_htab *htab)
 		hlist_nulls_for_each_entry_safe(l, n, head, hash_node) {
 			hlist_nulls_del_rcu(&l->hash_node);
 			htab_elem_free(htab, l);
+			cond_resched();
 		}

Should we put cond_resched() here, inside the top-level 'for' loop but outside the per-bucket loop? Do you really have a long linked list for a particular bucket? Otherwise, the patch looks good to me. In hashtab.c, we already have cond_resched() in some other places to mitigate similar issues.

 	}
 	migrate_enable();
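
To illustrate the placement being suggested, here is a minimal userspace sketch, not kernel code. All names in it (struct elem, struct table, table_new, table_insert, table_delete_all) are hypothetical stand-ins, and sched_yield() is used only as a rough analogue of cond_resched(), which in the kernel yields only when a reschedule is actually pending. The sketch assumes buckets are short, so one yield per bucket is enough; if a single bucket could hold a very long chain, yielding inside the inner loop, as in the posted patch, would bound latency more tightly.

```c
#include <assert.h>
#include <sched.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; not the structures in kernel/bpf/hashtab.c. */
struct elem {
	struct elem *next;
	unsigned int key;
};

struct table {
	struct elem **buckets;
	size_t n_buckets;
};

static struct table *table_new(size_t n_buckets)
{
	struct table *t = malloc(sizeof(*t));

	assert(t != NULL);
	t->buckets = calloc(n_buckets, sizeof(*t->buckets));
	assert(t->buckets != NULL);
	t->n_buckets = n_buckets;
	return t;
}

/* Push a new element onto the head of its bucket's chain. */
static void table_insert(struct table *t, unsigned int key)
{
	struct elem *e = malloc(sizeof(*e));
	size_t b = key % t->n_buckets;

	assert(e != NULL);
	e->key = key;
	e->next = t->buckets[b];
	t->buckets[b] = e;
}

/*
 * Free every element and return how many were freed.
 * *yields is set to the number of times the scheduler was offered the CPU.
 */
static size_t table_delete_all(struct table *t, size_t *yields)
{
	size_t freed = 0;

	*yields = 0;
	for (size_t i = 0; i < t->n_buckets; i++) {
		struct elem *e = t->buckets[i];

		/* Inner loop: drain one bucket without yielding. */
		while (e) {
			struct elem *next = e->next;

			free(e);
			freed++;
			e = next;
		}
		t->buckets[i] = NULL;

		/* Outer loop: yield once per bucket (stand-in for cond_resched()). */
		sched_yield();
		(*yields)++;
	}
	return freed;
}
```

With this placement the number of scheduling points scales with the bucket count rather than the element count, which keeps the per-element free path free of extra calls.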