Hi,

On 10/18/2022 11:08 PM, Paul E. McKenney wrote:
> On Tue, Oct 18, 2022 at 03:31:20PM +0800, Hou Tao wrote:
>> Hi,
>>
>> On 10/17/2022 9:39 PM, Paul E. McKenney wrote:
>>> On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote:
SNIP
>>>
>> Thanks for the review. But it seems I missed another possible use case for
>> rcu_trace_implies_rcu_gp() in bpf memory allocator. The code snippet for
>> free_mem_alloc() is as following:
>>
>> static void free_mem_alloc(struct bpf_mem_alloc *ma)
>> {
>>     /* waiting_for_gp lists was drained, but __free_rcu might
>>      * still execute. Wait for it now before we freeing percpu caches.
>>      */
>>     rcu_barrier_tasks_trace();
>>     rcu_barrier();
>>     free_mem_alloc_no_barrier(ma);
>> }
>>
>> It uses rcu_barrier_tasks_trace() and rcu_barrier() to wait for the completion
>> of pending call_rcu_tasks_trace()s and call_rcu()s. I think it is also safe to
>> check rcu_trace_implies_rcu_gp() in free_mem_alloc() and if it is true, there is
>> no need to call rcu_barrier().
>>
>> static void free_mem_alloc(struct bpf_mem_alloc *ma)
>> {
>>     /* waiting_for_gp lists was drained, but __free_rcu_tasks_trace()
>>      * or __free_rcu() might still execute. Wait for it now before we
>>      * freeing percpu caches.
>>      */
>>     rcu_barrier_tasks_trace();
>>     if (!rcu_trace_implies_rcu_gp())
>>         rcu_barrier();
>>     free_mem_alloc_no_barrier(ma);
>> }
>>
>> Does the above change look good to you ? If it is, I will post v3 to include the
>> above change and add your Reviewed-by tag.
> Unfortunately, although synchronize_rcu_tasks_trace() implies
> that synchronize_rcu(), there is no relationship between the
> callbacks. Furthermore, rcu_barrier_tasks_trace() does not imply
> synchronize_rcu_tasks_trace().

Yes. I see. And according to the code, if there is no pending cb,
rcu_barrier_tasks_trace() will return immediately.
It is also possible that the rcu_tasks_trace kthread is in the middle of
waiting for a grace period when rcu_barrier_tasks_trace() is invoked, so
rcu_barrier_tasks_trace() does not imply synchronize_rcu_tasks_trace().
>
> So the above change really would break things. Please do not do it.

However I am a little confused about the conclusion. If only considering the
invocations of call_rcu() and call_rcu_tasks_trace() in kernel/bpf/memalloc.c, I
think it is safe to do so, right ? Because if rcu_trace_implies_rcu_gp() is
true, there will be no invocation of call_rcu() and rcu_barrier_tasks_trace()
will wait for the completion of pending call_rcu_tasks_trace(). If
rcu_trace_implies_rcu_gp() is false, rcu_barrier_tasks_trace() and rcu_barrier()
will do the job. If considering the invocations of call_rcu() in other places, I
think it is definitely unsafe to do that, right ?
>
> You could use workqueues or similar to make the rcu_barrier_tasks_trace()
> and the rcu_barrier() wait concurrently, though. This would of course
> require some synchronization.

Thanks for the suggestion. Will check it later.
>
> Thanx, Paul
>
>>>> Change Log:
>>>>
>>>> v2:
>>>>  * codify the implication of RCU Tasks Trace grace period instead of
>>>>    assuming for it
>>>>
>>>> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@xxxxxxxxxxxxxxx
>>>>
>>>> Hou Tao (3):
>>>>   bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
>>>>   bpf: Use rcu_trace_implies_rcu_gp() in local storage map
>>>>   bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
>>>>
>>>> Paul E. McKenney (1):
>>>>   rcu-tasks: Provide rcu_trace_implies_rcu_gp()
>>>>
>>>>  include/linux/rcupdate.h       | 12 ++++++++++++
>>>>  kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
>>>>  kernel/bpf/core.c              |  8 +++++++-
>>>>  kernel/bpf/memalloc.c          | 15 ++++++++++-----
>>>>  kernel/rcu/tasks.h             |  2 ++
>>>>  5 files changed, 42 insertions(+), 8 deletions(-)
>>>>
>>>> --
>>>> 2.29.2
>>>>
>>> .
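
For reference, the "wait concurrently" suggestion above can be modelled in
userspace roughly as follows. This is only a sketch under stated assumptions,
not kernel code: struct fake_barrier, fake_barrier_wait(),
wait_both_barriers() and fake_free_mem_alloc() are hypothetical names, the
sleeps stand in for rcu_barrier_tasks_trace() and rcu_barrier(), and a
pthread stands in for a work item. The pthread_join() call plays the role of
the synchronization that would be required.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-in for one barrier primitive. */
struct fake_barrier {
    unsigned int delay_ms; /* simulated time until pending callbacks finish */
    atomic_bool done;      /* set once the simulated barrier has returned */
};

/* Stand-in for rcu_barrier() or rcu_barrier_tasks_trace(): block, then mark done. */
static void fake_barrier_wait(struct fake_barrier *b)
{
    struct timespec ts = {
        .tv_sec = b->delay_ms / 1000,
        .tv_nsec = (long)(b->delay_ms % 1000) * 1000000L,
    };

    nanosleep(&ts, NULL);
    atomic_store(&b->done, true);
}

/* Thread entry point: the analogue of a work function running one barrier. */
static void *barrier_worker(void *arg)
{
    fake_barrier_wait(arg);
    return NULL;
}

/*
 * Wait for both barriers concurrently: one runs in a helper thread, the
 * other in the caller. Returns 0 on success or a pthread error code.
 */
static int wait_both_barriers(struct fake_barrier *tasks_trace,
                              struct fake_barrier *rcu)
{
    pthread_t worker;

    if (pthread_create(&worker, NULL, barrier_worker, tasks_trace)) {
        /* No helper available: fall back to waiting sequentially. */
        fake_barrier_wait(tasks_trace);
        fake_barrier_wait(rcu);
        return 0;
    }

    fake_barrier_wait(rcu);
    /* Synchronization point: do not proceed until the helper is done too. */
    return pthread_join(worker, NULL);
}

/* Stand-in for free_mem_alloc(): free only after both barriers returned. */
static bool fake_free_mem_alloc(struct fake_barrier *tasks_trace,
                                struct fake_barrier *rcu)
{
    if (wait_both_barriers(tasks_trace, rcu))
        return false;

    assert(atomic_load(&tasks_trace->done) && atomic_load(&rcu->done));
    /* The per-CPU caches would be freed here. */
    return true;
}
```

With this shape the total wait is roughly the longer of the two barriers
rather than their sum, and neither barrier is skipped, so the model does not
rely on rcu_trace_implies_rcu_gp() at all. In the kernel the helper thread
would presumably be a work item (for example queued with schedule_work() and
waited for with flush_work()), but that mapping is an assumption and not
something posted in this thread.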