On Sat, Aug 27, 2022 at 11:43 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> From: Hou Tao <houtao1@xxxxxxxxxx>
>
> Per-cpu htab->map_locked is used to prohibit the concurrent accesses
> from both NMI and non-NMI contexts. But since commit 74d862b682f5
> ("sched: Make migrate_disable/enable() independent of RT"),
> migrations_disable() is also preemptible under CONFIG_PREEMPT case,

nit: migrate_disable

> so now map_locked also disallows concurrent updates from normal contexts
> (e.g. userspace processes) unexpectedly as shown below:
>
> process A                       process B
>
> htab_map_update_elem()
>   htab_lock_bucket()
>     migrate_disable()
>     /* return 1 */
>     __this_cpu_inc_return()
>     /* preempted by B */
>
>                                 htab_map_update_elem()
>                                   /* the same bucket as A */
>                                   htab_lock_bucket()
>                                     migrate_disable()
>                                     /* return 2, so lock fails */
>                                     __this_cpu_inc_return()
>                                     return -EBUSY
>
> A fix that seems feasible is using in_nmi() in htab_lock_bucket() and
> only checking the value of map_locked for nmi context. But it will
> re-introduce dead-lock on bucket lock if htab_lock_bucket() is re-entered
> through non-tracing program (e.g. fentry program).
>
> So fixing it by using disable_preempt() instead of migrate_disable() when
> increasing htab->map_locked. However when htab_use_raw_lock() is false,
> bucket lock will be a sleepable spin-lock and it breaks disable_preempt(),
> so still use migrate_disable() for spin-lock case and leave the
> concurrent map updates problem to BPF memory allocator patchset in which
> !htab_use_raw_lock() case will be removed.

I think the description needs a bit more clarity, and I think you mean
preempt_disable() instead of disable_preempt().

Suggestion:

One cannot use preempt_disable() to fix this issue, as htab_use_raw_lock()
being false causes the bucket lock to be a sleepable spin lock, which does
not work with preempt_disable().
Therefore, use migrate_disable() when using the spinlock instead of
preempt_disable(), and defer fixing concurrent updates to when the kernel
has its own BPF memory allocator.

> Reviewed-by: Hao Luo <haoluo@xxxxxxxxxx>
> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>

Fixes tag here please?

> ---
>  kernel/bpf/hashtab.c | 23 ++++++++++++++++++-----
>  1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index b301a63afa2f..6fb3b7fd1622 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -162,17 +162,25 @@ static inline int htab_lock_bucket(const struct bpf_htab *htab,
>                                    unsigned long *pflags)
>  {
>         unsigned long flags;
> +       bool use_raw_lock;
>
>         hash = hash & HASHTAB_MAP_LOCK_MASK;
>
> -       migrate_disable();
> +       use_raw_lock = htab_use_raw_lock(htab);
> +       if (use_raw_lock)
> +               preempt_disable();
> +       else
> +               migrate_disable();
>         if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>                 __this_cpu_dec(*(htab->map_locked[hash]));
> -               migrate_enable();
> +               if (use_raw_lock)
> +                       preempt_enable();
> +               else
> +                       migrate_enable();
>                 return -EBUSY;
>         }
>
> -       if (htab_use_raw_lock(htab))
> +       if (use_raw_lock)
>                 raw_spin_lock_irqsave(&b->raw_lock, flags);
>         else
>                 spin_lock_irqsave(&b->lock, flags);
> @@ -185,13 +193,18 @@ static inline void htab_unlock_bucket(const struct bpf_htab *htab,
>                                       struct bucket *b, u32 hash,
>                                       unsigned long flags)
>  {
> +       bool use_raw_lock = htab_use_raw_lock(htab);
> +
>         hash = hash & HASHTAB_MAP_LOCK_MASK;
> -       if (htab_use_raw_lock(htab))
> +       if (use_raw_lock)
>                 raw_spin_unlock_irqrestore(&b->raw_lock, flags);
>         else
>                 spin_unlock_irqrestore(&b->lock, flags);
>         __this_cpu_dec(*(htab->map_locked[hash]));
> -       migrate_enable();
> +       if (use_raw_lock)
> +               preempt_enable();
> +       else
> +               migrate_enable();
>  }
>
>  static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
> --
> 2.29.2
>