On 2019-04-24 05:15:52 [-0700], Matthew Wilcox wrote: > On Wed, Apr 24, 2019 at 01:12:04PM +0200, Sebastian Andrzej Siewior wrote: > > The swap code synchronizes its access to the (four) pagevec struct > > (which is allocated per-CPU) by disabling preemption. This works and the > > one struct needs to be accessed from interrupt context is protected by > > disabling interrupts. This was manually audited and there is no lockdep > > coverage for this. > > There is one case where the per-CPU of a remote CPU needs to be accessed > > and this is solved by started a worker on the remote CPU and waiting for > > it to finish. > > > > In v1 [0] it was attempted to add per-CPU spinlocks for the access to > > struct. This would add lockdep coverage and access from a remote CPU so > > the worker wouldn't be required. > > >From my point of view, what is missing from this description is why we > want to be able to access these structs from a remote CPU. It's explained > a little better in the 4/4 changelog, but I don't see any numbers that > suggest what kinds of gains we might see (eg "reduces power consumption > by x% on a particular setup", or even "average length of time in idle > extended from x ms to y ms"). Pulling out a CPU from idle or userland computation looks bad. In the first series I had numbers how long it takes to compute the loop for all per-CPU data from one CPU vs the workqueue. Somehow the uncontended lock was bad as per krobot report while I never got stable numbers from that test. The other motivation is RT where we need proper locking and can't use that preempt-disable based locking. Sebastian