On Tue, Feb 04, 2020 at 11:54:02AM -0500, Alex Kogan wrote:
> > On Feb 3, 2020, at 10:47 AM, Waiman Long <longman@xxxxxxxxxx> wrote:
> >
> > On 2/3/20 10:28 AM, Peter Zijlstra wrote:
> >> On Mon, Feb 03, 2020 at 09:59:12AM -0500, Waiman Long wrote:
> >>> On 2/3/20 8:45 AM, Peter Zijlstra wrote:
> >>>> Presumably you have a workload where CNA is actually a win? That is,
> >>>> what inspired you to go down this road? Which actual kernel lock is so
> >>>> contended on NUMA machines that we need to do this?

> There are quite a few actually. files_struct.file_lock,
> file_lock_context.flc_lock and lockref.lock are some concrete examples
> that get very hot in will-it-scale benchmarks.

Right, that's all a variant of banging on the same resources across
nodes. I'm not sure there's anything fundamental we can fix there.

> And then there are the spinlocks in __futex_data.queues, which get hot
> when applications have contended (pthread) locks -- LevelDB is an
> example.

A NUMA-aware rework of futexes has been on the todo list for years :/

> Our initial motivation was the observation that the kernel qspinlock is
> not NUMA-aware. So what, you may ask. Much as people realized in the
> past that global spinning is bad for performance and switched from
> ticket locks to locks with local spinning (e.g., MCS), I think everyone
> would agree these days that bouncing a lock (and cache lines in general)
> across NUMA nodes is similarly bad. And as CNA demonstrates, we are
> easily leaving 2-3x speedups on the table by doing just that with the
> current qspinlock.

Actual benchmarks with performance numbers are required. They help
motivate the patches and give reviewers clues on how to reproduce /
inspect the claims made.
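
For reference, the "local spinning" idea mentioned above (MCS) looks
roughly like the sketch below. This is a minimal user-space illustration
using C11 atomics, not the kernel's qspinlock or the CNA code from the
series; the names (mcs_node, mcs_lock_acquire, mcs_lock_release) are
illustrative only. The point is that each waiter spins on its own node's
flag rather than on a shared word, so a lock handoff touches only one
remote cache line.

	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stddef.h>

	struct mcs_node {
		_Atomic(struct mcs_node *) next;
		atomic_bool locked;
	};

	struct mcs_lock {
		_Atomic(struct mcs_node *) tail;
	};

	static void mcs_lock_acquire(struct mcs_lock *lock, struct mcs_node *self)
	{
		struct mcs_node *prev;

		atomic_store_explicit(&self->next, NULL, memory_order_relaxed);
		atomic_store_explicit(&self->locked, true, memory_order_relaxed);

		/* Join the tail of the waiter queue. */
		prev = atomic_exchange_explicit(&lock->tail, self,
						memory_order_acq_rel);
		if (!prev)
			return;		/* queue was empty: lock acquired */

		/* Publish ourselves to the predecessor, then spin locally. */
		atomic_store_explicit(&prev->next, self, memory_order_release);
		while (atomic_load_explicit(&self->locked, memory_order_acquire))
			;		/* spin on our own cache line only */
	}

	static void mcs_lock_release(struct mcs_lock *lock, struct mcs_node *self)
	{
		struct mcs_node *next =
			atomic_load_explicit(&self->next, memory_order_acquire);

		if (!next) {
			/* No visible successor: try to mark the lock free. */
			struct mcs_node *expected = self;

			if (atomic_compare_exchange_strong_explicit(&lock->tail,
					&expected, NULL,
					memory_order_release,
					memory_order_relaxed))
				return;

			/* A successor is between the xchg and linking in;
			 * wait for it to appear. */
			while (!(next = atomic_load_explicit(&self->next,
							memory_order_acquire)))
				;
		}

		/* Hand off by flipping the successor's private flag. */
		atomic_store_explicit(&next->locked, false, memory_order_release);
	}

As I understand the series, CNA keeps this same queue discipline but
prefers handing the lock to a waiter on the current NUMA node, deferring
waiters from other nodes to a secondary queue; that preference is where
the claimed 2-3x on contended cross-node workloads comes from.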