Hi Yonghong,
On 12/19/24 10:45 AM, Yonghong Song Wrote:
On 12/18/24 1:21 AM, Abel Wu wrote:
The following commit
bc235cdb423a ("bpf: Prevent deadlock from recursive bpf_task_storage_[get|delete]")
first introduced deadlock prevention for fentry/fexit programs attaching
on bpf_task_storage helpers. That commit also employed the logic in map
free path in its v6 version.
Later bpf_cgrp_storage was first introduced in
c4bcfb38a95e ("bpf: Implement cgroup storage available to non-cgroup-attached bpf progs")
which faces the same issue as bpf_task_storage, instead of its busy
counter, NULL was passed to bpf_local_storage_map_free() which opened
a window to cause deadlock:
<TASK>
_raw_spin_lock_irqsave+0x3d/0x50
bpf_local_storage_update+0xd1/0x460
bpf_cgrp_storage_get+0x109/0x130
bpf_prog_72026450ec387477_cgrp_ptr+0x38/0x5e
bpf_trace_run1+0x84/0x100
cgroup_storage_ptr+0x4c/0x60
bpf_selem_unlink_storage_nolock.constprop.0+0x135/0x160
bpf_selem_unlink_storage+0x6f/0x110
bpf_local_storage_map_free+0xa2/0x110
bpf_map_free_deferred+0x5b/0x90
process_one_work+0x17c/0x390
worker_thread+0x251/0x360
kthread+0xd2/0x100
ret_from_fork+0x34/0x50
ret_from_fork_asm+0x1a/0x30
</TASK>
[ Since the verifier treats 'void *' as scalar which
prevents me from getting a pointer to 'struct cgroup *',
I added a raw tracepoint in cgroup_storage_ptr() to
help reproducing this issue. ]
Although it is tricky to reproduce, the risk of deadlock exists and
worthy of a fix, by passing its busy counter to the free procedure so
it can be properly incremented before storage/smap locking.
The above stack trace and explanation does not show that we will have
a potential dead lock here. You mentioned that it is tricky to reproduce,
does it mean that you have done some analysis or coding to reproduce it?
Could you share the details on why you think we may have deadlock here?
The stack is A-A deadlocked: cgroup_storage_ptr() is called with
storage->lock held, while the bpf_prog attaching on this function
also tries to acquire the same lock by calling bpf_cgrp_storage_get()
thus leading to a AA deadlock.
The tricky part is, instead of attaching on cgroup_storage_ptr()
directly, I added a tracepoint inside it to hook:
------
diff --git a/kernel/bpf/bpf_cgrp_storage.c b/kernel/bpf/bpf_cgrp_storage.c
index 20f05de92e9c..679209d4f88f 100644
--- a/kernel/bpf/bpf_cgrp_storage.c
+++ b/kernel/bpf/bpf_cgrp_storage.c
@@ -40,6 +40,8 @@ static struct bpf_local_storage __rcu **cgroup_storage_ptr(void *owner)
{
struct cgroup *cg = owner;
+ trace_cgroup_ptr(cg);
+
return &cg->bpf_cgrp_storage;
}
------
The reason doing so is that typecasting from 'void *owner' to
'struct cgroup *' will be rejected by the verifier. But there
could be other ways to obtain a pointer to the @owner cgroup
too, making the deadlock possible.
Thanks,
Abel
Fixes: c4bcfb38a95e ("bpf: Implement cgroup storage available to non-cgroup-attached bpf progs")
Signed-off-by: Abel Wu <wuyun.abel@xxxxxxxxxxxxx>
---
kernel/bpf/bpf_cgrp_storage.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/bpf_cgrp_storage.c b/kernel/bpf/bpf_cgrp_storage.c
index 20f05de92e9c..7996fcea3755 100644
--- a/kernel/bpf/bpf_cgrp_storage.c
+++ b/kernel/bpf/bpf_cgrp_storage.c
@@ -154,7 +154,7 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
static void cgroup_storage_map_free(struct bpf_map *map)
{
- bpf_local_storage_map_free(map, &cgroup_cache, NULL);
+ bpf_local_storage_map_free(map, &cgroup_cache, &bpf_cgrp_storage_busy);
}
/* *gfp_flags* is a hidden argument provided by the verifier */