Hi, On 10/17/2022 9:39 PM, Paul E. McKenney wrote: > On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote: >> From: Hou Tao <houtao1@xxxxxxxxxx> >> >> Hi, >> >> Now bpf uses RCU grace period chaining to wait for the completion of >> access from both sleepable and non-sleepable bpf program: calling >> call_rcu_tasks_trace() firstly to wait for a RCU-tasks-trace grace >> period, then in its callback calls call_rcu() or kfree_rcu() to wait for >> a normal RCU grace period. >> >> According to the implementation of RCU Tasks Trace, it inovkes >> ->postscan_func() to wait for one RCU-tasks-trace grace period and >> rcu_tasks_trace_postscan() inovkes synchronize_rcu() to wait for one >> normal RCU grace period in turn, so one RCU-tasks-trace grace period >> will imply one normal RCU grace period. To codify the implication, >> introduces rcu_trace_implies_rcu_gp() in patch #1. And using it in patch >> #2~#4 to remove unnecessary call_rcu() or kfree_rcu() in bpf subsystem. >> Other two uses of call_rcu_tasks_trace() are unchanged: for >> __bpf_prog_put_rcu() there is no gp chain and for >> __bpf_tramp_image_put_rcu_tasks() it chains RCU tasks trace GP and RCU >> tasks GP. >> >> An alternative way to remove these unnecessary RCU grace period >> chainings is using the RCU polling API to check whether or not a normal >> RCU grace period has passed (e.g. get_state_synchronize_rcu()). But it >> needs an unsigned long space for each free element or each call, and >> it is not affordable for local storage element, so as for now always >> rcu_trace_implies_rcu_gp(). >> >> Comments are always welcome. > For #2-#4: > > Reviewed-by: Paul E. McKenney <paulmck@xxxxxxxxxx> > > (#1 already has my Signed-off-by, in case anyone was wondering.) Thanks for the review. But it seems I missed another possible use case for rcu_trace_implies_rcu_gp() in bpf memory allocator. The code snippet for free_mem_alloc() is as following: static void free_mem_alloc(struct bpf_mem_alloc *ma) { /* waiting_for_gp lists was drained, but __free_rcu might * still execute. Wait for it now before we freeing percpu caches. */ rcu_barrier_tasks_trace(); rcu_barrier(); free_mem_alloc_no_barrier(ma); } It uses rcu_barrier_tasks_trace() and rcu_barrier() to wait for the completion of pending call_rcu_tasks_trace()s and call_rcu()s. I think it is also safe to check rcu_trace_implies_rcu_gp() in free_mem_alloc() and if it is true, there is no need to call rcu_barrier(). static void free_mem_alloc(struct bpf_mem_alloc *ma) { /* waiting_for_gp lists was drained, but __free_rcu_tasks_trace() * or __free_rcu() might still execute. Wait for it now before we * freeing percpu caches. */ rcu_barrier_tasks_trace(); if (!rcu_trace_implies_rcu_gp()) rcu_barrier(); free_mem_alloc_no_barrier(ma); } Does the above change look good to you ? If it is, I will post v3 to include the above change and add your Reviewed-by tag. > >> Change Log: >> >> v2: >> * codify the implication of RCU Tasks Trace grace period instead of >> assuming for it >> >> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@xxxxxxxxxxxxxxx >> >> Hou Tao (3): >> bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator >> bpf: Use rcu_trace_implies_rcu_gp() in local storage map >> bpf: Use rcu_trace_implies_rcu_gp() for program array freeing >> >> Paul E. McKenney (1): >> rcu-tasks: Provide rcu_trace_implies_rcu_gp() >> >> include/linux/rcupdate.h | 12 ++++++++++++ >> kernel/bpf/bpf_local_storage.c | 13 +++++++++++-- >> kernel/bpf/core.c | 8 +++++++- >> kernel/bpf/memalloc.c | 15 ++++++++++----- >> kernel/rcu/tasks.h | 2 ++ >> 5 files changed, 42 insertions(+), 8 deletions(-) >> >> -- >> 2.29.2 >> > .