Re: [PATCH bpf v2 4/5] bpf: Optimize the free of inner map

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 13, 2023 at 08:33:23PM +0800, Hou Tao wrote:
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index e2d2701ce2c45..5a7906f2b027e 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -694,12 +694,20 @@ static void bpf_map_free_deferred(struct work_struct *work)
>  {
>  	struct bpf_map *map = container_of(work, struct bpf_map, work);
>  	struct btf_record *rec = map->record;
> +	int acc_ctx;
>  
>  	security_bpf_map_free(map);
>  	bpf_map_release_memcg(map);
>  
> -	if (READ_ONCE(map->free_after_mult_rcu_gp))
> -		synchronize_rcu_mult(call_rcu, call_rcu_tasks_trace);

The previous patch 3 is doing too much.
There is maybe_wait_bpf_programs() that will do synchronize_rcu()
when necessary.
The patch 3 could do synchronize_rcu_tasks_trace() only and it will solve the issue.

> +	acc_ctx = atomic_read(&map->may_be_accessed_prog_ctx) & BPF_MAP_ACC_PROG_CTX_MASK;
> +	if (acc_ctx) {
> +		if (acc_ctx == BPF_MAP_ACC_NORMAL_PROG_CTX)
> +			synchronize_rcu();
> +		else if (acc_ctx == BPF_MAP_ACC_SLEEPABLE_PROG_CTX)
> +			synchronize_rcu_tasks_trace();
> +		else
> +			synchronize_rcu_mult(call_rcu, call_rcu_tasks_trace);

and this patch 4 goes to far.
Could you add sleepable_refcnt in addition to existing refcnt that is incremented
in outer map when it's used by sleepable prog and when sleepable_refcnt > 0
the caller of bpf_map_free_deferred sets free_after_mult_rcu_gp.
(which should be renamed to free_after_tasks_rcu_gp).
Patch 3 is simpler and patch 4 is simple too.
No need for atomic_or games.

In addition I'd like to see an extra patch that demonstrates this UAF
when update/delete is done by syscall bpf prog type.
The test case in patch 5 is doing update/delete from user space.
If that was the only issue we could have easily extended maybe_wait_bpf_programs()
to do synchronize_rcu_tasks_trace() and that would close the issue exposed by patch 5.
But inner maps can indeed be updated by syscall bpf prog and since they run
under rcu_read_lock_trace() we cannot add synchronize_rcu_tasks_trace() to
maybe_wait_bpf_programs() because it will deadlock.
So let's make sure we have test cases for all combinations where inner maps
can be updated/deleted.




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux