Re: [PATCH 1/3] bpf: Disable preemption when increasing per-cpu map_locked

Hi Hou Tao,

On Sat, Aug 20, 2022 at 8:14 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> From: Hou Tao <houtao1@xxxxxxxxxx>
>
> Per-cpu htab->map_locked is used to prevent concurrent accesses from
> both NMI and non-NMI contexts. But since 74d862b682f51 ("sched:
> Make migrate_disable/enable() independent of RT"), migrate_disable()
> is also preemptible in the !PREEMPT_RT case, so map_locked now also
> unexpectedly rejects concurrent updates from normal contexts (e.g.
> userspace processes), as shown below:
>
> process A                      process B
>
> htab_map_update_elem()
>   htab_lock_bucket()
>     migrate_disable()
>     /* return 1 */
>     __this_cpu_inc_return()
>     /* preempted by B */
>
>                                htab_map_update_elem()
>                                  /* the same bucket as A */
>                                  htab_lock_bucket()
>                                    migrate_disable()
>                                    /* return 2, so lock fails */
>                                    __this_cpu_inc_return()
>                                    return -EBUSY
>
> A seemingly feasible fix is to use in_nmi() in htab_lock_bucket() and
> only check the value of map_locked in NMI context. But that would
> re-introduce a deadlock on the bucket lock if htab_lock_bucket() is
> re-entered through a non-tracing program (e.g. a fentry program).
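[ For illustration, a minimal sketch of that rejected alternative might
look like the below. The exact placement of the in_nmi() check is my
assumption, not code from this series:

	/* hypothetical: only enforce map_locked for NMI context */
	if (in_nmi() &&
	    unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
		__this_cpu_dec(*(htab->map_locked[hash]));
		return -EBUSY;
	}
	raw_spin_lock_irqsave(&b->raw_lock, flags);

A non-NMI fentry program attached to a function called inside the
bucket-lock critical section would then pass the in_nmi() check and
spin forever on the already-held b->raw_lock. ]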
>
> So fix it by using preempt_disable() instead of migrate_disable() when
> increasing htab->map_locked. However, when htab_use_raw_lock() is
> false, the bucket lock is a sleepable spin-lock which must not be
> taken with preemption disabled, so keep using migrate_disable() in
> the spin-lock case.
>
> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
> ---

IIUC, this patch enlarges the scope of preemption disabling to cover
the increment of map_locked. But I don't think the change is
meaningful.

This patch only affects the case when the raw lock is used. In that
case, irqs are disabled for the critical section protected by
b->raw_lock, and a raw spin lock itself never sleeps, on either RT or
non-RT. So my understanding of this patch is that it just makes sure
preemption cannot happen during the __this_cpu_inc_return() itself.
But that window is so small that it should be really unlikely to hit.
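To make that concrete, the window being closed is roughly the one
below (annotations mine, based on the code in this patch):

	preempt_disable();	/* new in this patch */
	__this_cpu_inc_return(*(htab->map_locked[hash]));
	/* before this patch, preemption was still possible here */
	raw_spin_lock_irqsave(&b->raw_lock, flags);
	/* irqs off from here on, so no preemption anyway */

Once raw_spin_lock_irqsave() returns, interrupts (and therefore
preemption) are already off for the whole critical section, so the
patch only closes the few instructions between the per-cpu increment
and taking the raw lock.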

>  kernel/bpf/hashtab.c | 23 ++++++++++++++++++-----
>  1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 6c530a5e560a..ad09da139589 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -162,17 +162,25 @@ static inline int htab_lock_bucket(const struct bpf_htab *htab,
>                                    unsigned long *pflags)
>  {
>         unsigned long flags;
> +       bool use_raw_lock;
>
>         hash = hash & HASHTAB_MAP_LOCK_MASK;
>
> -       migrate_disable();
> +       use_raw_lock = htab_use_raw_lock(htab);
> +       if (use_raw_lock)
> +               preempt_disable();
> +       else
> +               migrate_disable();
>         if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>                 __this_cpu_dec(*(htab->map_locked[hash]));
> -               migrate_enable();
> +               if (use_raw_lock)
> +                       preempt_enable();
> +               else
> +                       migrate_enable();
>                 return -EBUSY;
>         }
>
> -       if (htab_use_raw_lock(htab))
> +       if (use_raw_lock)
>                 raw_spin_lock_irqsave(&b->raw_lock, flags);
>         else
>                 spin_lock_irqsave(&b->lock, flags);
> @@ -185,13 +193,18 @@ static inline void htab_unlock_bucket(const struct bpf_htab *htab,
>                                       struct bucket *b, u32 hash,
>                                       unsigned long flags)
>  {
> +       bool use_raw_lock = htab_use_raw_lock(htab);
> +
>         hash = hash & HASHTAB_MAP_LOCK_MASK;
> -       if (htab_use_raw_lock(htab))
> +       if (use_raw_lock)
>                 raw_spin_unlock_irqrestore(&b->raw_lock, flags);
>         else
>                 spin_unlock_irqrestore(&b->lock, flags);
>         __this_cpu_dec(*(htab->map_locked[hash]));
> -       migrate_enable();
> +       if (use_raw_lock)
> +               preempt_enable();
> +       else
> +               migrate_enable();
>  }
>
>  static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
> --
> 2.29.2
>


