Hi all,

As core counts rapidly expand over the next four years, Namhyung and I were looking at global locks that we are already seeing high contention on today. Some of these are not MM specific:

- cgroup_mutex
- cgroup_threadgroup_rwsem
- tasklist_lock
- kernfs_mutex (although this should now be substantially better with the kernfs_locks array)

Others *are* MM specific:

- list_lrus_mutex
- pcpu_drain_mutex
- shrinker_mutex (formerly shrinker_rwsem)
- vmap_purge_lock
- slab_mutex

This is only looking at fleet data for global static locks, not locks like zone->lock that are dynamically allocated. (mmap_lock was substantially improved by per-vma locking, although it does still show up for very large vmas.)

A couple of questions:

(1) How are people quantifying these pain points, if at all, in synthetic testing? Are there any workloads or benchmarks that are particularly good at surfacing this in the lab, beyond the traditional will-it-scale? (The above is from production data.)

(2) Is anybody working on any of the above global locks? I'm trying to surface gaps for locks that will likely become even more painful in the coming years.

Thanks!
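
For concreteness on (1): below is a minimal strawman of the kind of directed stressor I mean, not anything we run in production. It just hammers cgroup creation/removal from many threads, which serializes on cgroup_mutex; the mount path, thread count, and iteration count are placeholders to adjust for the machine under test, and it assumes cgroup2 mounted at /sys/fs/cgroup and root privileges.

/*
 * Strawman stressor: N threads each repeatedly create and remove a
 * private child cgroup. cgroup mkdir/rmdir take cgroup_mutex, so this
 * concentrates contention on that one global lock.
 */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

#define BASE     "/sys/fs/cgroup"   /* assumed cgroup2 mount point */
#define NTHREADS 64                 /* placeholder thread count */
#define ITERS    10000              /* placeholder iteration count */

static void *worker(void *arg)
{
	long id = (long)arg;
	char path[256];
	int i;

	snprintf(path, sizeof(path), BASE "/lock_stress_%ld", id);

	for (i = 0; i < ITERS; i++) {
		/* each mkdir/rmdir of a cgroup directory takes cgroup_mutex */
		if (mkdir(path, 0755) && errno != EEXIST)
			perror("mkdir");
		if (rmdir(path))
			perror("rmdir");
	}
	return NULL;
}

int main(void)
{
	pthread_t tids[NTHREADS];
	long i;

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&tids[i], NULL, worker, (void *)i);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(tids[i], NULL);
	return 0;
}

Something like this is easy to point lock profiling at, but it only exercises one lock at a time, which is why I'm curious what broader workloads people are using.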