On Thu, Mar 02, 2023 at 11:42:57AM +0100, David Hildenbrand wrote:
> On 09.02.23 16:01, Marcelo Tosatti wrote:
> > Goal is to have vmstat_shepherd transfer from per-CPU counters
> > to global counters remotely. For this, an atomic this_cpu_cmpxchg
> > is necessary.
> >
> > Following the kernel convention for cmpxchg/cmpxchg_local,
> > change ARM's this_cpu_cmpxchg_ helpers to be atomic,
> > and add this_cpu_cmpxchg_local_ helpers which are not atomic.
> >
> > Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> >
> > Index: linux-vmstat-remote/arch/arm64/include/asm/percpu.h
> > ===================================================================
> > --- linux-vmstat-remote.orig/arch/arm64/include/asm/percpu.h
> > +++ linux-vmstat-remote/arch/arm64/include/asm/percpu.h
> > @@ -232,13 +232,23 @@ PERCPU_RET_OP(add, add, ldadd)
> >  	_pcp_protect_return(xchg_relaxed, pcp, val)
> >  #define this_cpu_cmpxchg_1(pcp, o, n)	\
> > -	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +	_pcp_protect_return(cmpxchg, pcp, o, n)
> >  #define this_cpu_cmpxchg_2(pcp, o, n)	\
> > -	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +	_pcp_protect_return(cmpxchg, pcp, o, n)
> >  #define this_cpu_cmpxchg_4(pcp, o, n)	\
> > -	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +	_pcp_protect_return(cmpxchg, pcp, o, n)
> >  #define this_cpu_cmpxchg_8(pcp, o, n)	\
> > -	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +	_pcp_protect_return(cmpxchg, pcp, o, n)
> > +
> > +#define this_cpu_cmpxchg_local_1(pcp, o, n)	\
> > +	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +#define this_cpu_cmpxchg_local_2(pcp, o, n)	\
> > +	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +#define this_cpu_cmpxchg_local_4(pcp, o, n)	\
> > +	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +#define this_cpu_cmpxchg_local_8(pcp, o, n)	\
> > +	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +
> 
> Call me confused (not necessarily your fault :) ).
> 
> We have cmpxchg_local, cmpxchg_relaxed and cmpxchg. this_cpu_cmpxchg_local_*
> now calls ... *drumroll* ... cmpxchg_relaxed.
> 
> IIUC, cmpxchg_local is only guaranteed to be atomic WRT the current CPU
> (especially, protection against interrupts when the operation is
> implemented using multiple instructions). We do have a generic
> implementation that disables/enables interrupts.
> 
> IIUC, cmpxchg_relaxed is an atomic update without any memory ordering
> guarantees (in contrast to cmpxchg, cmpxchg_acquire, cmpxchg_release). We
> default to arch_cmpxchg if we don't have arch_cmpxchg_relaxed. arch_cmpxchg
> defaults to arch_cmpxchg_local, if not supported.
> 
> Naturally I wonder:
> 
> (a) Should these new variants rather be called this_cpu_cmpxchg_relaxed_*?

No: it happens that on arm64, cmpxchg_local == cmpxchg_relaxed. See commit
cf10b79a7d88edc689479af989b3a88e9adf07ff.

> (b) Should these new variants rather call the "_local" variant?

They probably should. But this patchset keeps the current behaviour of
this_cpu_cmpxchg (for the new this_cpu_cmpxchg_local), which was:

#define this_cpu_cmpxchg_1(pcp, o, n)	\
-	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
+	_pcp_protect_return(cmpxchg, pcp, o, n)
#define this_cpu_cmpxchg_2(pcp, o, n)	\
-	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
+	_pcp_protect_return(cmpxchg, pcp, o, n)
#define this_cpu_cmpxchg_4(pcp, o, n)	\
-	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
+	_pcp_protect_return(cmpxchg, pcp, o, n)
#define this_cpu_cmpxchg_8(pcp, o, n)	\
-	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
+	_pcp_protect_return(cmpxchg, pcp, o, n)

Thanks.