Re: [PATCH bpf-next v1 08/22] rqspinlock: Protect pending bit owners from stalls


 



On Wed, 8 Jan 2025 at 00:52, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Tue, Jan 07, 2025 at 08:17:56PM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 07, 2025 at 10:44:16PM +0530, Kumar Kartikeya Dwivedi wrote:
> > > On Tue, 7 Jan 2025 at 20:22, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Tue, Jan 07, 2025 at 05:59:50AM -0800, Kumar Kartikeya Dwivedi wrote:
> > > > > +     if (val & _Q_LOCKED_MASK) {
> > > > > +             RES_RESET_TIMEOUT(ts);
> > > > > +             smp_cond_load_acquire(&lock->locked, !VAL || RES_CHECK_TIMEOUT(ts, ret));
> > > > > +     }
> > > >
> > > > Please check how smp_cond_load_acquire() works on ARM64 and then add
> > > > some words on how RES_CHECK_TIMEOUT() is still okay.
> > >
> > > Thanks Peter,
> > >
> > > The __cmpwait_relaxed bit does indeed look problematic, my
> > > understanding is that the ldxr + wfe sequence can get stuck because we
> > > may not have any updates on the &lock->locked address, and we’ll not
> > > call into RES_CHECK_TIMEOUT since that cond_expr check precedes the
> > > __cmpwait macro.
> >
> > IIRC the WFE will wake at least on every interrupt but might have an
> > inherent timeout itself, so it will make some progress, but not at a
> > speed comparable to a pure spin.

Yes. Also, interrupts may be disabled (e.g. for irqsave spin lock
calls), so the wakeup on interrupt cannot be relied upon.

> >
> > > Do you have suggestions on resolving this? We want to invoke this
> > > macro as part of the waiting loop. We can have a
> > > rqspinlock_smp_cond_load_acquire that maps to no-WFE smp_load_acquire
> > > loop on arm64 and uses the asm-generic version elsewhere.
> >
> > That will make arm64 sad -- that wfe thing is how they get away with not
> > having paravirt spinlocks iirc. Also power consumption.
> >

Makes sense.

> > I've not read well enough to remember what order of timeout you're
> > looking for, but you could have the tick sample the lock like a watchdog
> > like, and write a magic 'lock' value when it is deemed stuck.
>
> Oh, there is this thread:
>
>   https://lkml.kernel.org/r/20241107190818.522639-1-ankur.a.arora@xxxxxxxxxx
>
> That seems to add exactly what you need -- with the caveat that the
> arm64 people will of course have to accept it first :-)

This seems perfect, thanks. While it adds a relaxed variant, it can be
extended with an acquire variant as well.
I will make use of this once it lands; it looks like it is pretty close.
Until then, I think falling back to a non-WFE loop is the best course.
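
To illustrate the shape of such a non-WFE fallback, here is a rough
userspace sketch using C11 atomics rather than the kernel primitives.
All names in it (demo_lock, demo_now_ns, demo_wait_unlocked) are
hypothetical and are not part of the patch; the point is only that the
timeout check runs on the spin path itself, so it does not depend on a
store to the lock word or on an interrupt to make progress.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the lock's 'locked' byte; not the kernel type. */
struct demo_lock {
	_Atomic uint8_t locked;
};

/*
 * Current time in nanoseconds. Assumption: wall-clock time from C11
 * timespec_get() is good enough for a sketch; real code would want a
 * monotonic clock source.
 */
static uint64_t demo_now_ns(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Spin with acquire loads until 'locked' reads zero or the timeout
 * expires. Returns 0 if the byte was observed clear, -1 on timeout.
 * There is no event wait here: every iteration re-reads the byte, and
 * the clock is sampled every 256 iterations to keep the loop cheap.
 */
static int demo_wait_unlocked(struct demo_lock *lock, uint64_t timeout_ns)
{
	uint64_t deadline;
	unsigned int spins = 0;

	assert(lock != NULL);
	deadline = demo_now_ns() + timeout_ns;

	for (;;) {
		if (atomic_load_explicit(&lock->locked,
					 memory_order_acquire) == 0)
			return 0;
		if ((++spins & 0xff) == 0 && demo_now_ns() >= deadline)
			return -1;
	}
}
```

The cost is the one pointed out above: a pure spin like this gives up
the power savings of the WFE-based wait.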




