On Mon, Oct 15, 2018 at 10:50:48AM +0100, Mel Gorman wrote: > On Fri, Oct 12, 2018 at 09:21:41AM +0200, Vlastimil Babka wrote: > > On 9/14/18 4:59 PM, Sebastian Andrzej Siewior wrote: > > I think this evaluation is missing the other side of the story, and > > that's the cost of using a spinlock (even uncontended) instead of > > disabling preemption. The expectation for LRU pagevec is that the local > > operations will be much more common than draining of other CPU's, so > > it's optimized for the former. > > > > Agreed, the drain operation should be extremely rare except under heavy > memory pressure, particularly if mixed with THP allocations. The overall > intent seems to be improving lockdep coverage but I don't think we > should take a hit in the common case just to get that coverage. Bear in > mind that the main point of the pagevec (whether it's true or not) is to > avoid the much heavier LRU lock. So indeed, if the only purpose of this patch were to make lockdep wiser, a pair of spin_lock_acquire() / spin_unlock_release() would be enough to teach it and would avoid the overhead. Now another significant incentive behind this change is to improve CPU isolation. Workloads relying on owning the entire CPU without being disturbed are interested in this as it allows to offload some noise. It's no big deal for those who can tolerate rare events but often CPU isolation is combined with deterministic latency requirements. So, I'm not saying this per-CPU spinlock is necessarily the right answer, I don't know that code enough to have an opinion, but I still wish we can find a solution. Thanks.