Hi Christoph,

On Tue, May 16, 2023 at 10:09:02AM +0200, Christoph Lameter wrote:
> The patchset still modifies the semantics of this_cpu operations,
> replacing the lockless RMV operations with locked ones.

It does that to follow the pre-existing kernel convention:

	function-name			LOCK prefix
	cmpxchg				YES
	cmpxchg_local			NO

So the patchset introduces:

	function-name			LOCK prefix
	this_cpu_cmpxchg		YES
	this_cpu_cmpxchg_local		NO

(A usage sketch contrasting the two variants is appended at the end of
this mail.)

> One of the rationales for the use of this_cpu operations is their
> efficiency, since locked RMV atomics are avoided.

And there is the freedom to choose between this_cpu_cmpxchg and
this_cpu_cmpxchg_local, depending on the intended usage.

> This patchset destroys that functionality.

Patch 6 is:

	Subject: [PATCH v8 06/13] add this_cpu_cmpxchg_local and asm-generic definitions

which adds this_cpu_cmpxchg_local.

Patch 7 converts all other this_cpu_cmpxchg users (except the vmstat
ones):

	[PATCH v8 07/13] convert this_cpu_cmpxchg users to this_cpu_cmpxchg_local

So the non-LOCK'ed behaviour is maintained for existing users.

> If you want locked RMV semantics then use them through cmpxchg() and
> friends. Do not modify this_cpu operations by changing the
> implementation in the arch code.

But then it would be necessary to disable preemption here:

static inline void mod_zone_state(struct zone *zone, enum zone_stat_item item,
				  long delta, int overstep_mode)
{
	struct per_cpu_zonestat __percpu *pcp = zone->per_cpu_zonestats;
	s32 __percpu *p = pcp->vm_stat_diff + item;
	long o, n, t, z;

	do {
		z = 0;  /* overflow to zone counters */

		/*
		 * The fetching of the stat_threshold is racy. We may apply
		 * a counter threshold to the wrong cpu if we get
		 * rescheduled while executing here. However, the next
		 * counter update will apply the threshold again and
		 * therefore bring the counter under the threshold again.
		 *
		 * Most of the time the thresholds are the same anyway
		 * for all cpus in a zone.
		 */
		t = this_cpu_read(pcp->stat_threshold);

		o = this_cpu_read(*p);
		n = delta + o;

		if (abs(n) > t) {
			int os = overstep_mode * (t >> 1);

			/* Overflow must be added to zone counters */
			z = n + os;
			n = -os;
		}
	} while (this_cpu_cmpxchg(*p, o, n) != o);
		 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	if (z)
		zone_page_state_add(z, zone, item);
}

Earlier you objected to disabling preemption on this codepath (which is
what led to this patchset in the first place):

"Using preemption is a way to make this work correctly. However, doing
so would sacrifice the performance, low impact and the scalability of
the vm counters."

(A sketch of that preempt_disable() alternative is appended at the end
of this mail.)

So it seems a locked this_cpu function, which does

	lock cmpxchg

is desired.

Perhaps you disagree with the this_cpu_cmpxchg_local/this_cpu_cmpxchg
naming?
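---

Appendix 1: a usage sketch contrasting the two variants under the
naming convention above. This is illustrative only (demo_counter and
demo_update are made-up names, not from the patchset); it just shows
where each function would be the appropriate choice:

#include <linux/percpu.h>

DEFINE_PER_CPU(int, demo_counter);

static void demo_update(int new)
{
	int old;

	/*
	 * Locked variant (LOCK-prefixed cmpxchg on x86): safe when
	 * other CPUs may concurrently cmpxchg this CPU's slot, which
	 * is what folding vmstat counters remotely requires.
	 */
	do {
		old = this_cpu_read(demo_counter);
	} while (this_cpu_cmpxchg(demo_counter, old, new) != old);

	/*
	 * Local variant (no LOCK prefix): only protects against
	 * interference from this CPU itself (interrupts/preemption),
	 * mirroring the existing cmpxchg() vs cmpxchg_local() split.
	 */
	do {
		old = this_cpu_read(demo_counter);
	} while (this_cpu_cmpxchg_local(demo_counter, old, new) != old);
}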
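Appendix 2: what the "use cmpxchg() and friends" route would look like
in mod_zone_state(). A sketch only (mod_zone_state_alt is a made-up
name): because cmpxchg() needs a stable pointer to this CPU's counter,
the whole read-modify-write loop has to run with preemption disabled,
which is the cost the quote above objects to:

static inline void mod_zone_state_alt(struct zone *zone,
				      enum zone_stat_item item, long delta,
				      int overstep_mode)
{
	struct per_cpu_zonestat __percpu *pcp = zone->per_cpu_zonestats;
	s32 __percpu *p = pcp->vm_stat_diff + item;
	s32 *ptr;
	long o, n, t, z;

	/* Pin the task so ptr keeps pointing at this CPU's counter. */
	preempt_disable();
	ptr = this_cpu_ptr(p);
	do {
		z = 0;	/* overflow to zone counters */
		t = this_cpu_read(pcp->stat_threshold);
		o = READ_ONCE(*ptr);
		n = delta + o;

		if (abs(n) > t) {
			int os = overstep_mode * (t >> 1);

			/* Overflow must be added to zone counters */
			z = n + os;
			n = -os;
		}
		/* Locked cmpxchg: remote CPUs may update *ptr too. */
	} while (cmpxchg(ptr, o, n) != o);
	preempt_enable();

	if (z)
		zone_page_state_add(z, zone, item);
}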