On Wed, Jul 28, 2021 at 10:58:08AM -0700, Linus Torvalds wrote: > On Wed, Jul 28, 2021 at 10:37 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > > > > +/* > > + * Increment the current CPU's rcu_data structure's ->dynticks field > > + * with ordering. Return the new value. > > + */ > > +static noinstr unsigned long rcu_dynticks_inc(int incby) > > +{ > > + return arch_atomic_add_return(incby, this_cpu_ptr(&rcu_data.dynticks)); > > +} > > Maybe inline? I bet the compiler does this already, but why give it the choice? I will add inline. > But more I reacted to how we sadly don't have percpu atomics. They'd > be fairly easy to add on x86, but I guess it's not a huge deal. > > And hey, if this is pretty much the only place that would use them, I > guess that's not much of an issue. I did a git grep for "atomic_.*this_cpu_ptr" in -rcu, and came up with this: arch/s390/kernel/time.c: atomic_t *sw_ptr = this_cpu_ptr(&clock_sync_word); arch/s390/kernel/time.c: atomic_t *sw_ptr = this_cpu_ptr(&clock_sync_word); arch/x86/xen/spinlock.c: atomic_t *nest_cnt = this_cpu_ptr(&xen_qlock_wait_nest); drivers/irqchip/irq-apple-aic.c: atomic_andnot(irq_bit, this_cpu_ptr(&aic_vipi_enable)); drivers/irqchip/irq-apple-aic.c: atomic_or(irq_bit, this_cpu_ptr(&aic_vipi_enable)); drivers/irqchip/irq-apple-aic.c: if (atomic_read(this_cpu_ptr(&aic_vipi_flag)) & irq_bit) drivers/irqchip/irq-apple-aic.c: enabled = atomic_read(this_cpu_ptr(&aic_vipi_enable)); drivers/irqchip/irq-apple-aic.c: firing = atomic_fetch_andnot(enabled, this_cpu_ptr(&aic_vipi_flag)) & enabled; kernel/events/core.c: if (atomic_read(this_cpu_ptr(&perf_cgroup_events))) kernel/events/core.c: if (atomic_read(this_cpu_ptr(&perf_cgroup_events))) kernel/rcu/rcuscale.c: atomic_dec(this_cpu_ptr(&n_async_inflight)); kernel/rcu/rcuscale.c: if (rhp && atomic_read(this_cpu_ptr(&n_async_inflight)) < gp_async_max) { kernel/rcu/rcuscale.c: atomic_inc(this_cpu_ptr(&n_async_inflight)); kernel/rcu/tree.c: return arch_atomic_add_return(incby, this_cpu_ptr(&rcu_data.dynticks)); kernel/rcu/tree.c: return !(atomic_read(this_cpu_ptr(&rcu_data.dynticks)) & 0x1); kernel/trace/ring_buffer.c: if (atomic_inc_return(this_cpu_ptr(&checking)) != 1) kernel/trace/ring_buffer.c: atomic_dec(this_cpu_ptr(&checking)); Given that this mostly moves text from the first argument to the function name, I suspect that we would need quite a few instances of any given atomic operation to make the added lines for all the definitions and the documentation worth it. Maybe for atomic_read()? /** * atomic_read_this_cpu - atomically read from this CPU's variable * @p: Pointer to per-CPU variable. * * Do an atomic_read() from the current CPU's instance of the * specified per-CPU variable. */ #define atomic_read_this_cpu(p) atomic_read(this_cpu_ptr(p)) But atomic_read_this_cpu(&rcu_data.dynticks) isn't all that much shorter than atomic_read(this_cpu_ptr(&rcu_data.dynticks)). Thanx, Paul