On Fri, 23 Aug 2024, Will Deacon wrote:

> > +#ifdef CONFIG_ARCH_HAS_ACQUIRE_RELEASE
> > +#define raw_read_seqcount_begin(s)					\
> > +({									\
> > +	unsigned _seq;							\
> > +									\
> > +	while ((_seq = seqprop_sequence_acquire(s)) & 1)		\
> > +		cpu_relax();						\
>
> It would also be interesting to see whether smp_cond_load_acquire()
> performs any better than this loop in the !RT case.

The hack to do this follows. The kernel boots with no change in cycles,
and it builds a kernel just fine. Another benchmark may be better: all
my synthetic tests do is run the function calls in a loop in parallel
on multiple CPUs.

The main effect here may be a reduction in power, since the busy loop
is no longer required.

I would favor a solution like this, but the patch is not clean given
the need to cast away the const attribute.

Index: linux/include/linux/seqlock.h
===================================================================
--- linux.orig/include/linux/seqlock.h
+++ linux/include/linux/seqlock.h
@@ -325,9 +325,9 @@ SEQCOUNT_LOCKNAME(mutex, struct m
 #define raw_read_seqcount_begin(s)					\
 ({									\
 	unsigned _seq;							\
+	seqcount_t *e = seqprop_ptr((struct seqcount_spinlock *)s);	\
 									\
-	while ((_seq = seqprop_sequence_acquire(s)) & 1)		\
-		cpu_relax();						\
+	_seq = smp_cond_load_acquire(&e->sequence, (VAL & 1) == 0);	\
 									\
 	kcsan_atomic_next(KCSAN_SEQLOCK_REGION_MAX);			\
 	_seq;								\
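
For reference, smp_cond_load_acquire() evaluates its condition with VAL
bound to the value just loaded, which is why the hunk spells the test
(VAL & 1) == 0. Below is a small userspace model of the two idioms in
plain C11 atomics. It is only a sketch (spin_begin()/wait_begin() are
made-up names): it models the generic implementation, i.e. relaxed
polling as in smp_cond_load_relaxed() followed by acquire ordering as in
smp_acquire__after_ctrl_dep(). The arm64 version can instead wait in WFE
until the monitored cache line changes, which is where the power saving
mentioned above would come from.

/*
 * Userspace model of the two polling idioms, C11 atomics only.
 * This cannot express the arm64 LDXR/WFE wait; it only models the
 * memory ordering of the generic kernel macros.
 */
#include <stdatomic.h>
#include <stdio.h>

static _Atomic unsigned int sequence;

/* The old idiom: an acquire load in a busy loop, as in the macro above. */
static unsigned int spin_begin(void)
{
	unsigned int seq;

	while ((seq = atomic_load_explicit(&sequence,
					   memory_order_acquire)) & 1)
		;	/* cpu_relax() would go here */

	return seq;
}

/*
 * The smp_cond_load_acquire() idiom: poll with relaxed loads until the
 * sequence is even, then order later reads after the final load. The
 * kernel uses smp_acquire__after_ctrl_dep(); an acquire fence is a
 * conservative stand-in for it here.
 */
static unsigned int wait_begin(void)
{
	unsigned int seq;

	do {
		seq = atomic_load_explicit(&sequence,
					   memory_order_relaxed);
	} while (seq & 1);
	atomic_thread_fence(memory_order_acquire);

	return seq;
}

int main(void)
{
	atomic_store(&sequence, 2);	/* even: no writer in progress */
	printf("%u %u\n", spin_begin(), wait_begin());
	return 0;
}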