On Tue, Nov 05, 2024 at 12:30:37PM -0600, Haris Okanovic wrote:
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index d4f581c1e21d..112027eabbfc 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -256,6 +256,31 @@ do {								\
> })
> #endif
> 
> +/**
> + * smp_vcond_load_relaxed() - (Spin) wait until an expected value at address
> + * with no ordering guarantees. Spins until `(*addr & mask) == val` or
> + * `nsecs` elapse, and returns the last observed `*addr` value.
> + *
> + * @nsecs: timeout in nanoseconds

FWIW, I don't mind the relative timeout, it makes the API easier to
use. Yes, it may take longer in absolute time if the thread is
scheduled out before local_clock_noinstr() is read, but the same can
happen in the caller anyway. It's similar to udelay(), it can take
longer if the thread is scheduled out.

> + * @addr: pointer to an integer
> + * @mask: a bit mask applied to read values
> + * @val: Expected value with mask
> + */
> +#ifndef smp_vcond_load_relaxed
> +#define smp_vcond_load_relaxed(nsecs, addr, mask, val) ({	\
> +	const u64 __start = local_clock_noinstr();		\
> +	u64 __nsecs = (nsecs);					\
> +	typeof(addr) __addr = (addr);				\
> +	typeof(*__addr) __mask = (mask);			\
> +	typeof(*__addr) __val = (val);				\
> +	typeof(*__addr) __cur;					\
> +	smp_cond_load_relaxed(__addr, (				\
> +		(VAL & __mask) == __val ||			\
> +		local_clock_noinstr() - __start > __nsecs	\
> +	));							\
> +})

The generic implementation has the same problem as Ankur's current
series. smp_cond_load_relaxed() can't wait on anything other than the
variable at __addr. If it goes into a WFE, there's nothing executed to
read the timer and check for progress. Any generic implementation of
such a function would have to use cpu_relax() and polling.

-- 
Catalin
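
[Editor's note: a minimal polling-based sketch of the generic fallback
Catalin describes might look as follows. This is only an illustration,
not code from the thread: it reuses the names of the quoted patch and
assumes READ_ONCE(), cpu_relax() and local_clock_noinstr() are usable
at the call site.]

#ifndef smp_vcond_load_relaxed
#define smp_vcond_load_relaxed(nsecs, addr, mask, val) ({	\
	const u64 __start = local_clock_noinstr();		\
	u64 __nsecs = (nsecs);					\
	typeof(addr) __addr = (addr);				\
	typeof(*__addr) __mask = (mask);			\
	typeof(*__addr) __val = (val);				\
	typeof(*__addr) __cur;					\
	for (;;) {						\
		/* Plain relaxed load; no ordering implied */	\
		__cur = READ_ONCE(*__addr);			\
		if ((__cur & __mask) == __val)			\
			break;					\
		/* Re-read the timer on every iteration */	\
		if (local_clock_noinstr() - __start > __nsecs)	\
			break;					\
		cpu_relax();					\
	}							\
	/* Return the last observed value */			\
	__cur;							\
})
#endif

Because every iteration re-reads both the variable and the timer, the
timeout check keeps making progress even where an event-based
smp_cond_load_relaxed() would otherwise park in WFE with nothing left
executing to notice that the deadline has passed.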