On Mon, 29 Apr 2024 at 18:57, Siddharth Chintamaneni
<sidchintamaneni@xxxxxxxxx> wrote:
>
> From: Siddharth Chintamaneni <sidchintamaneni@xxxxxx>
>
> This patch addresses a possible deadlock issue in queue and
> stack map types.
>
> Deadlock could happen when a nested BPF program
> acquires the same lock as the parent BPF program
> to perform a write operation on the same map as
> the first one. This bug is also reported by
> syzbot.
>
> Link: https://lore.kernel.org/lkml/0000000000004c3fc90615f37756@xxxxxxxxxx/
> Reported-by: syzbot+8bdfc2c53fb2b63e1871@xxxxxxxxxxxxxxxxxxxxxxxxx
> Fixes: f1a2e44a3aec ("bpf: add queue and stack maps")
> Signed-off-by: Siddharth Chintamaneni <sidchintamaneni@xxxxxx>
> ---
>  kernel/bpf/queue_stack_maps.c | 42 +++++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
>
> diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c
> index d869f51ea93a..4b7df1a53cf2 100644
> --- a/kernel/bpf/queue_stack_maps.c
> +++ b/kernel/bpf/queue_stack_maps.c
> @@ -18,6 +18,7 @@ struct bpf_queue_stack {
>  	raw_spinlock_t lock;
>  	u32 head, tail;
>  	u32 size; /* max_entries + 1 */
> +	int __percpu *map_locked;
>
>  	char elements[] __aligned(8);
>  };
> @@ -78,6 +79,16 @@ static struct bpf_map *queue_stack_map_alloc(union bpf_attr *attr)
>
>  	qs->size = size;
>
> +	qs->map_locked = bpf_map_alloc_percpu(&qs->map,
> +					      sizeof(int),
> +					      sizeof(int),
> +					      GFP_USER);

GFP_USER | __GFP_NOWARN, like we do everywhere else (see the adjusted
call at the end of this mail).

> +	if (!qs->map_locked) {
> +		bpf_map_area_free(qs);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +
>  	raw_spin_lock_init(&qs->lock);
>
>  	return &qs->map;
> @@ -98,6 +109,16 @@ static long __queue_map_get(struct bpf_map *map, void *value, bool delete)
>  	int err = 0;
>  	void *ptr;
>
> +	preempt_disable();
> +	local_irq_save(flags);
> +	if (unlikely(__this_cpu_inc_return(*(qs->map_locked)) != 1)) {
> +		__this_cpu_dec(*(qs->map_locked));
> +		local_irq_restore(flags);
> +		preempt_enable();
> +		return -EBUSY;
> +	}
> +	preempt_enable();
> +

You increment the map_locked counter here, but never decrement it
after unlock. Likewise in all the other cases. As a result, this
operation can never be called again on a given CPU for a given map
after the first time. That is probably also why CI is failing:
https://github.com/kernel-patches/bpf/actions/runs/8882578097/job/24387802831?pr=6915
returns -16 (EBUSY).

E.g. check hashtab.c: it does __this_cpu_dec() after unlock in
htab_unlock_bucket().

> [...]
>
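To make the first point concrete, the allocation would become
something like this (untested sketch, just to show the flag):

	qs->map_locked = bpf_map_alloc_percpu(&qs->map, sizeof(int),
					      sizeof(int),
					      GFP_USER | __GFP_NOWARN);
	if (!qs->map_locked) {
		bpf_map_area_free(qs);
		return ERR_PTR(-ENOMEM);
	}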
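For the missing decrement, the cleanest fix is probably to factor the
sequence into paired helpers, following the htab_lock_bucket() /
htab_unlock_bucket() pattern in hashtab.c. Rough, untested sketch
(the helper names are made up, and unlike your patch it keeps
preemption disabled across the critical section, as hashtab.c does):

static inline int queue_stack_map_lock(struct bpf_queue_stack *qs,
				       unsigned long *pflags)
{
	unsigned long flags;

	preempt_disable();
	local_irq_save(flags);
	if (unlikely(__this_cpu_inc_return(*(qs->map_locked)) != 1)) {
		/* Re-entered on this CPU: back out and report busy. */
		__this_cpu_dec(*(qs->map_locked));
		local_irq_restore(flags);
		preempt_enable();
		return -EBUSY;
	}
	raw_spin_lock(&qs->lock);
	*pflags = flags;
	return 0;
}

static inline void queue_stack_map_unlock(struct bpf_queue_stack *qs,
					  unsigned long flags)
{
	raw_spin_unlock(&qs->lock);
	/* The decrement your patch is missing. */
	__this_cpu_dec(*(qs->map_locked));
	local_irq_restore(flags);
	preempt_enable();
}

Then __queue_map_get() and the other entry points become:

	if (queue_stack_map_lock(qs, &flags))
		return -EBUSY;
	...
	queue_stack_map_unlock(qs, flags);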