Re: [PATCH bpf-next v2 1/3] bpf: Disable preemption when increasing per-cpu map_locked

On 8/29/2022 6:39 AM, KP Singh wrote:
> On Sat, Aug 27, 2022 at 11:43 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>> From: Hou Tao <houtao1@xxxxxxxxxx>
>>
>> Per-cpu htab->map_locked is used to prohibit concurrent accesses
>> from both NMI and non-NMI contexts. But since commit 74d862b682f5
>> ("sched: Make migrate_disable/enable() independent of RT"),
>> migrations_disable() is also preemptible under CONFIG_PREEMPT case,
> nit: migrate_disable
Will fix in v3.
>
>> so now map_locked also unexpectedly disallows concurrent updates from
>> normal contexts (e.g. userspace processes), as shown below:
>>
>> process A                      process B
>>
>> htab_map_update_elem()
>>   htab_lock_bucket()
>>     migrate_disable()
>>     /* return 1 */
>>     __this_cpu_inc_return()
>>     /* preempted by B */
>>
>>                                htab_map_update_elem()
>>                                  /* the same bucket as A */
>>                                  htab_lock_bucket()
>>                                    migrate_disable()
>>                                    /* return 2, so lock fails */
>>                                    __this_cpu_inc_return()
>>                                    return -EBUSY
>>
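To make the race easier to follow, the pre-patch lock path looks roughly
like this (reconstructed from the diff below; simplified, so not the
exact upstream source), with the race window annotated:

static inline int htab_lock_bucket(const struct bpf_htab *htab,
				   struct bucket *b, u32 hash,
				   unsigned long *pflags)
{
	unsigned long flags;

	hash = hash & HASHTAB_MAP_LOCK_MASK;

	/* Since commit 74d862b682f5, migrate_disable() no longer implies
	 * preempt_disable() under CONFIG_PREEMPT.
	 */
	migrate_disable();
	/* Process A increments map_locked to 1 and can then be preempted
	 * before taking the bucket lock. Process B, scheduled on the same
	 * CPU and hashing to the same bucket, increments the counter to 2
	 * and incorrectly fails with -EBUSY.
	 */
	if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
		__this_cpu_dec(*(htab->map_locked[hash]));
		migrate_enable();
		return -EBUSY;
	}

	if (htab_use_raw_lock(htab))
		raw_spin_lock_irqsave(&b->raw_lock, flags);
	else
		spin_lock_irqsave(&b->lock, flags);
	*pflags = flags;

	return 0;
}
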
>> A fix that seems feasible is to use in_nmi() in htab_lock_bucket() and
>> only check the value of map_locked in NMI context. But that would
>> re-introduce a deadlock on the bucket lock if htab_lock_bucket() were
>> re-entered through a non-tracing program (e.g. an fentry program).
>>
>> So fixing it by using disable_preempt() instead of migrate_disable() when
>> increasing htab->map_locked. However when htab_use_raw_lock() is false,
>> bucket lock will be a sleepable spin-lock and it breaks disable_preempt(),
>> so still use migrate_disable() for spin-lock case and leave the
>> concurrent map updates problem to BPF memory allocator patchset in which
>> !htab_use_raw_lock() case will be removed.
> I think the description needs a bit more clarity, and I think you mean
> preempt_disable() instead of disable_preempt().
>
> Suggestion:
>
> One cannot use preempt_disable() to fix this issue: when
> htab_use_raw_lock() is false, the bucket lock is a sleepable spinlock,
> which does not work with preempt_disable().
>
> Therefore, use migrate_disable() when using the spinlock instead of
> preempt_disable() and defer fixing concurrent updates to when the kernel
> has its own BPF memory allocator.
Will update the commit message in v3 and thanks for the suggestion.
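
To make the constraint concrete: in the !htab_use_raw_lock() case
(CONFIG_PREEMPT_RT), spinlock_t is a sleeping lock, so an ordering like
the sketch below would sleep with preemption disabled, which is invalid.
The function is hypothetical, only for illustration:

/* Broken sketch, NOT the patch: invalid on PREEMPT_RT. */
static inline void broken_lock_bucket(struct bucket *b, unsigned long *pflags)
{
	unsigned long flags;

	preempt_disable();
	/* On PREEMPT_RT spinlock_t may sleep, and sleeping inside a
	 * preempt_disable() section is a bug. That is why the patch keeps
	 * migrate_disable() for the spinlock case.
	 */
	spin_lock_irqsave(&b->lock, flags);
	*pflags = flags;
}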
>
>> Reviewed-by: Hao Luo <haoluo@xxxxxxxxxx>
>> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
> Fixes tag here please?
Will add "Fixes: 74d862b682f5 ("sched: Make migrate_disable/enable() independent
of RT")" in v3.
>
>
>> ---
>>  kernel/bpf/hashtab.c | 23 ++++++++++++++++++-----
>>  1 file changed, 18 insertions(+), 5 deletions(-)
>>
>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>> index b301a63afa2f..6fb3b7fd1622 100644
>> --- a/kernel/bpf/hashtab.c
>> +++ b/kernel/bpf/hashtab.c
>> @@ -162,17 +162,25 @@ static inline int htab_lock_bucket(const struct bpf_htab *htab,
>>                                    unsigned long *pflags)
>>  {
>>         unsigned long flags;
>> +       bool use_raw_lock;
>>
>>         hash = hash & HASHTAB_MAP_LOCK_MASK;
>>
>> -       migrate_disable();
>> +       use_raw_lock = htab_use_raw_lock(htab);
>> +       if (use_raw_lock)
>> +               preempt_disable();
>> +       else
>> +               migrate_disable();
>>         if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>>                 __this_cpu_dec(*(htab->map_locked[hash]));
>> -               migrate_enable();
>> +               if (use_raw_lock)
>> +                       preempt_enable();
>> +               else
>> +                       migrate_enable();
>>                 return -EBUSY;
>>         }
>>
>> -       if (htab_use_raw_lock(htab))
>> +       if (use_raw_lock)
>>                 raw_spin_lock_irqsave(&b->raw_lock, flags);
>>         else
>>                 spin_lock_irqsave(&b->lock, flags);
>> @@ -185,13 +193,18 @@ static inline void htab_unlock_bucket(const struct bpf_htab *htab,
>>                                       struct bucket *b, u32 hash,
>>                                       unsigned long flags)
>>  {
>> +       bool use_raw_lock = htab_use_raw_lock(htab);
>> +
>>         hash = hash & HASHTAB_MAP_LOCK_MASK;
>> -       if (htab_use_raw_lock(htab))
>> +       if (use_raw_lock)
>>                 raw_spin_unlock_irqrestore(&b->raw_lock, flags);
>>         else
>>                 spin_unlock_irqrestore(&b->lock, flags);
>>         __this_cpu_dec(*(htab->map_locked[hash]));
>> -       migrate_enable();
>> +       if (use_raw_lock)
>> +               preempt_enable();
>> +       else
>> +               migrate_enable();
>>  }
>>
>>  static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
>> --
>> 2.29.2
>>
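
For reference, with the hunks above applied, htab_lock_bucket() would
read roughly as follows (reconstructed from the diff; the unchanged tail
of the function is assumed from its signature):

static inline int htab_lock_bucket(const struct bpf_htab *htab,
				   struct bucket *b, u32 hash,
				   unsigned long *pflags)
{
	unsigned long flags;
	bool use_raw_lock;

	hash = hash & HASHTAB_MAP_LOCK_MASK;

	/* The raw spinlock case is safe with preemption disabled across
	 * the map_locked update; the sleepable spinlock case keeps
	 * migrate_disable().
	 */
	use_raw_lock = htab_use_raw_lock(htab);
	if (use_raw_lock)
		preempt_disable();
	else
		migrate_disable();
	if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
		__this_cpu_dec(*(htab->map_locked[hash]));
		if (use_raw_lock)
			preempt_enable();
		else
			migrate_enable();
		return -EBUSY;
	}

	if (use_raw_lock)
		raw_spin_lock_irqsave(&b->raw_lock, flags);
	else
		spin_lock_irqsave(&b->lock, flags);
	*pflags = flags;

	return 0;
}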