On Wed, Aug 28 2024 at 10:15, Christoph Lameter wrote:
> On Fri, 23 Aug 2024, Thomas Gleixner wrote:
>
>> This all can be done without the extra copies of the counter
>> accessors. Uncompiled patch below.
>
> Great. Thanks. Tried it too initially but could not make it work right.
>
> One thing that we also want is the use of the smp_cond_load_acquire to
> have the cpu power down while waiting for a cacheline change.
>
> The code has several places where loops occur when the last bit is set in
> the seqcount.
>
> We could use smp_cond_load_acquire in load_sequence() but what do we do
> about the loops at the higher level? Also this does not sync with the lock
> checking logic.

Come on. It's not rocket science to figure that out.

Uncompiled delta patch below.

Thanks,

        tglx
---
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -23,6 +23,13 @@
 
 #include <asm/processor.h>
 
+#ifdef CONFIG_ARCH_HAS_ACQUIRE_RELEASE
+# define USE_LOAD_ACQUIRE	true
+# define USE_COND_LOAD_ACQUIRE	!IS_ENABLED(CONFIG_PREEMPT_RT)
+#else
+# define USE_LOAD_ACQUIRE	false
+# define USE_COND_LOAD_ACQUIRE	false
+#endif
 /*
  * The seqlock seqcount_t interface does not prescribe a precise sequence of
  * read begin/retry/end. For readers, typically there is a call to
@@ -134,10 +141,13 @@ static inline void seqcount_lockdep_read
 
 static __always_inline unsigned __seqprop_load_sequence(const seqcount_t *s, bool acquire)
 {
-	if (acquire && IS_ENABLED(CONFIG_ARCH_HAS_ACQUIRE_RELEASE))
-		return smp_load_acquire(&s->sequence);
-	else
+	if (!acquire || !USE_LOAD_ACQUIRE)
 		return READ_ONCE(s->sequence);
+
+	if (USE_COND_LOAD_ACQUIRE)
+		return smp_cond_load_acquire(&s->sequence, (s->sequence & 1) == 0);
+
+	return smp_load_acquire(&s->sequence);
 }
 
 /*
@@ -283,8 +293,12 @@ SEQCOUNT_LOCKNAME(mutex, struct m
 ({									\
 	unsigned __seq;							\
 									\
-	while ((__seq = seqprop_sequence(s, acquire)) & 1)		\
-		cpu_relax();						\
+	if (acquire && USE_COND_LOAD_ACQUIRE) {				\
+		__seq = seqprop_sequence(s, acquire);			\
+	} else {							\
+		while ((__seq = seqprop_sequence(s, acquire)) & 1)	\
+			cpu_relax();					\
+	}								\
 									\
 	kcsan_atomic_next(KCSAN_SEQLOCK_REGION_MAX);			\
 	__seq;								\
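
For anyone who wants to poke at the idea outside the kernel, here is a rough userspace sketch of what the conditional load in the patch above is meant to do: spin until the sequence is even and return exactly the value that passed the test, with acquire ordering. It uses C11 atomics, not the kernel primitives, and every demo_* name is a hypothetical stand-in invented for illustration. One detail to keep in mind when compiling the real thing: smp_cond_load_acquire() hands the freshly loaded value to its condition expression as VAL, so the kernel version would most likely want to test !(VAL & 1) instead of re-reading s->sequence.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for seqcount_t; not the kernel type. */
struct demo_seqcount {
	atomic_uint sequence;
};

/* Stand-in for cpu_relax(); a real port would issue an arch pause hint. */
static inline void demo_cpu_relax(void)
{
}

/*
 * Spin until the sequence is even (no writer in progress) and return the
 * value that satisfied the test. The predicate checks the loaded value
 * itself, so the returned count is always the one that was tested.
 */
static inline unsigned demo_read_begin(struct demo_seqcount *s)
{
	unsigned val;

	for (;;) {
		val = atomic_load_explicit(&s->sequence, memory_order_acquire);
		if (!(val & 1))
			break;
		demo_cpu_relax();
	}
	return val;
}

/* Returns nonzero if a writer ran since demo_read_begin() returned start. */
static inline int demo_read_retry(struct demo_seqcount *s, unsigned start)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->sequence, memory_order_relaxed) != start;
}

/* Writer side: make the count odd before touching the protected data. */
static inline void demo_write_begin(struct demo_seqcount *s)
{
	unsigned old = atomic_fetch_add_explicit(&s->sequence, 1,
						 memory_order_relaxed);

	assert(!(old & 1));	/* single writer, must start from even */
	(void)old;
	atomic_thread_fence(memory_order_release);
}

/* Writer side: make the count even again, publishing the update. */
static inline void demo_write_end(struct demo_seqcount *s)
{
	unsigned old = atomic_fetch_add_explicit(&s->sequence, 1,
						 memory_order_release);

	assert(old & 1);	/* must have been inside a write section */
	(void)old;
}
```

Because the wait now lives entirely inside the load helper, a caller has no need for its own "while (seq & 1) cpu_relax()" loop around it, which is the same simplification the last hunk makes for the acquire case.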