On Thu, 16 May 2024 at 10:34, Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
>
> On Thu, 16 May 2024 at 16:05, Barret Rhoden <brho@xxxxxxxxxx> wrote:
> >
> > On 5/14/24 08:40, Siddharth Chintamaneni wrote:
> > [...]
> > > +static inline int map_lock_inc(struct bpf_queue_stack *qs)
> > > +{
> > > +        unsigned long flags;
> > > +
> > > +        preempt_disable();
> > > +        local_irq_save(flags);
> > > +        if (unlikely(__this_cpu_inc_return(*(qs->map_locked)) != 1)) {
> > > +                __this_cpu_dec(*(qs->map_locked));
> > > +                local_irq_restore(flags);
> > > +                preempt_enable();
> > > +                return -EBUSY;
> > > +        }
> > > +
> > > +        local_irq_restore(flags);
> > > +        preempt_enable();
> >
> > it looks like you're taking the approach from kernel/bpf/hashtab.c to
> > use a per-cpu lock before grabbing the real lock. but in the success
> > case here (where you incremented the percpu counter), you're enabling
> > irqs and preemption.
> >
> > what happens if you get preempted right after this? you've left the
> > per-cpu bit set, but then you run on another cpu.
>
> Great catch, that's a bug. It's not a problem when BPF programs call
> this, as migration is disabled for them (but it's questionable whether
> we should keep preemption enabled between map_inc/dec increasing the
> chances of conflicts on the same CPU), but it's certainly a problem
> from the syscall path.
>

I was also thinking from the BPF programs' perspective, as migration is
disabled for them. I will fix this.

> >
> > possible alternative: instead of splitting the overall lock into "grab
> > percpu lock, then grab real lock", have a single function for both,
> > similar to htab_lock_bucket(). and keep irqs and preemption off from
> > the moment you start attempting the overall lock until you completely
> > unlock.
>
> +1.
> >
> > barret
> >
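
For concreteness, a combined lock/unlock pair along the lines suggested
above, mirroring htab_lock_bucket()/htab_unlock_bucket() in
kernel/bpf/hashtab.c, might look roughly like the untested sketch below.
qs_lock()/qs_unlock() are placeholder names; qs->map_locked is the
per-CPU counter added by this patch, and qs->lock is assumed to be the
map's existing raw_spinlock_t:

/*
 * Keep IRQs and preemption off from the first attempt to take the lock
 * until qs_unlock(), so the per-CPU counter can never be left set while
 * the task migrates to another CPU.
 */
static inline int qs_lock(struct bpf_queue_stack *qs, unsigned long *pflags)
{
        unsigned long flags;

        preempt_disable();
        local_irq_save(flags);

        if (unlikely(__this_cpu_inc_return(*(qs->map_locked)) != 1)) {
                /* Re-entry on this CPU; back out and report busy. */
                __this_cpu_dec(*(qs->map_locked));
                local_irq_restore(flags);
                preempt_enable();
                return -EBUSY;
        }

        raw_spin_lock(&qs->lock);
        *pflags = flags;
        return 0;
}

static inline void qs_unlock(struct bpf_queue_stack *qs, unsigned long flags)
{
        raw_spin_unlock(&qs->lock);
        __this_cpu_dec(*(qs->map_locked));
        local_irq_restore(flags);
        preempt_enable();
}

Callers would wrap the queue/stack update in qs_lock()/qs_unlock() and
propagate -EBUSY on contention, so neither IRQs nor preemption get
re-enabled until the real lock is dropped.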