Re: [PATCH bpf-next v2 3/6] locking/local_lock: Introduce local_trylock_irqsave()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 11, 2024 at 3:55 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> On 12/11/24 11:53, Vlastimil Babka wrote:
> > On 12/10/24 03:39, Alexei Starovoitov wrote:
> >> From: Alexei Starovoitov <ast@xxxxxxxxxx>
> >>
> >> Similar to local_lock_irqsave() introduce local_trylock_irqsave().
> >> It uses spin_trylock in PREEMPT_RT and always succeeds when !RT.
> >
> > Hmm but is that correct to always succeed? If we're in an nmi, we might be
> > preempting an existing local_(try)lock_irqsave() critical section because
> > disabling irqs doesn't stop NMI's, right?
>
> So unless I'm missing something, it would need to be a new kind of local
> lock to support this trylock operation on !RT?

Ohh. Correct. Forgot about nmi interrupting local_lock_irqsave region in !RT.

> Perhaps based on the same
> principle of a simple active/locked flag that I tried in my sheaves RFC? [1]
> There could be also the advantage that if all (potentially) irq contexts
> (not just nmi) used trylock, it would be sufficient to disable preeemption
> and not interrupts, which is cheaper.

I don't think it's the case.
pushf was slow on old x86.
According to https://www.agner.org/optimize/instruction_tables.pdf
it's 3 uops on skylake.
That could be faster than preempt_disable (incl %gs:addr)
which is 3-4 uops assuming cache hit.

> The RT variant could work as you proposed here, that was wrong in my RFC as
> you already pointed out when we discussed v1 of this series.
>
> [1]
> https://lore.kernel.org/all/20241112-slub-percpu-caches-v1-5-ddc0bdc27e05@xxxxxxx/

I like your
+struct local_tryirq_lock
approach, but let's put it in local_lock.h ?
and it probably needs local_inc_return() instead of READ/WRITE_ONCE.
With irq and nmis it's racy.

In the meantime I think I will fix below:

> >> +#define __local_trylock_irqsave(lock, flags)                        \
> >> +    ({                                                      \
> >> +            local_irq_save(flags);                          \
> >> +            local_trylock_acquire(this_cpu_ptr(lock));      \
> >> +            1;                                              \
> >> +    })

as
#define __local_trylock_irqsave(lock, flags)                    \
        ({                                                      \
                local_irq_save(flags);                          \
                local_trylock_acquire(this_cpu_ptr(lock));      \
                !in_nmi();                                      \
        })

I think that's good enough for memcg patch 4 and
doesn't grow local_lock_t on !RT.

We can introduce

typedef struct {
        int count;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
        struct lockdep_map      dep_map;
        struct task_struct      *owner;
#endif
} local_trylock_t;

and the whole set of local_trylock_lock, local_trylock_unlock,...
But that's quite a bit of code. Feels a bit overkill for memcg patch 4.
At this point it feels that adding 'int count' to existing local_lock_t
is cleaner.





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux