On Wed, Jan 11, 2023 at 09:42:18AM +0100, Christoph Lameter wrote:
> On Tue, 10 Jan 2023, Marcelo Tosatti wrote:
>
> > > The basic primitives add a lot of weight.
> >
> > Can't see any alternative given the necessity to avoid interruption
> > by the work to sync per-CPU vmstats to global vmstats.
>
> this_cpu operations are designed to operate on a *single* value (a counter) and can
> be run on an arbitrary cpu. There is no preemption or interrupt
> disable required since the counters of all cpus will be added up at the
> end.
>
> You want *two* values (the counter and the dirty flag) to be modified
> together and want to use the counters/flag to identify the cpu where
> these events occurred. this_cpu_xxx operations are not suitable for that
> purpose. You would need a way to ensure that both operations occur on the
> same cpu.

Which is either preempt_disable (CONFIG_HAVE_CMPXCHG_LOCAL case), or
local_irq_disable (!CONFIG_HAVE_CMPXCHG_LOCAL case).

> > > And the per cpu atomic updates operations require the modification
> > > of multiple values. The operation
> > > cannot be "atomic" in that sense anymore and we need some other form of
> > > synchronization that can
> > > span multiple instructions.
> >
> > So use this_cpu_cmpxchg() to avoid the overhead. Since we can no longer
> > count on preemption being disabled we still have some minor issues.
> > The fetching of the counter thresholds is racy.
> > A threshold from another cpu may be applied if we happen to be
> > rescheduled on another cpu. However, the following vmstat operation
> > will then bring the counter again under the threshold limit.
> >
> > Those small issues are gone, OTOH.
>
> Well you could use this_cpu_cmpxchg128 to update a 64 bit counter and a
> flag at the same time.

But then you transform the "per-CPU vmstat is dirty" bit (bool) into a
number of flags that must be scanned (when returning to userspace),
which increases the overhead of a fast path (return to userspace).

> Otherwise you will have to switch off preemption or
> interrupts when incrementing the counters and updating the dirty flag.
>
> Thus you do not really need the this_cpu operations anymore. It would
> be best to use a preempt_disable section and use C operators -- ++ for the
> counter and do regular assignment for the flag.

OK, can replace this_cpu operations with this_cpu_ptr + standard C operators
(and in fact can do that for interrupt disabled functions as well, that
is, when CONFIG_HAVE_CMPXCHG_LOCAL is not defined).

Is that it?
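
To make sure we mean the same pattern, here is a rough userspace model of it
(a sketch only, not the actual mm/vmstat.c code). Every model_* name is a
hypothetical stand-in: model_preempt_disable()/model_preempt_enable() play the
role of preempt_disable()/preempt_enable(), model_cpu_ptr() plays the role of
this_cpu_ptr(), and the CPU is passed explicitly because userspace has no
notion of "this cpu".

```c
/*
 * Userspace model only. In the kernel these would be per-CPU variables,
 * preempt_disable()/preempt_enable() and this_cpu_ptr(). All model_*
 * names are hypothetical stand-ins chosen for this illustration.
 */
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

struct model_pcp {
	long diff;	/* per-CPU counter delta, not yet folded */
	bool dirty;	/* true while diff holds pending updates */
};

static struct model_pcp model_pcp[MODEL_NR_CPUS];
static long model_global;		/* models the global vmstat counter */
static int model_preempt_count;		/* models the preempt count */

static void model_preempt_disable(void) { model_preempt_count++; }
static void model_preempt_enable(void)  { model_preempt_count--; }

/* Stand-in for this_cpu_ptr(): the caller names the CPU explicitly. */
static struct model_pcp *model_cpu_ptr(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return &model_pcp[cpu];
}

/* Update counter and dirty flag together using plain C operators. */
static void model_mod_state(int cpu, long delta)
{
	struct model_pcp *p;

	model_preempt_disable();	/* both stores hit the same CPU's data */
	p = model_cpu_ptr(cpu);
	p->diff += delta;
	p->dirty = true;
	model_preempt_enable();
}

/* Fold one CPU's delta into the global counter; true if it was dirty. */
static bool model_fold(int cpu)
{
	struct model_pcp *p = model_cpu_ptr(cpu);
	bool was_dirty = p->dirty;

	if (was_dirty) {
		model_global += p->diff;
		p->diff = 0;
		p->dirty = false;
	}
	return was_dirty;
}
```

Calling model_mod_state(1, 3) and then model_fold(1) moves the delta into
model_global and clears the flag. The point of the sketch is only that the
counter and the flag are written inside one non-preemptible section, so a
single bool per CPU is enough to check on return to userspace.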