v1->v2: - Mark more locks as terminal. - Add a patch to remove the unused version field from lock_class. - Steal some bits from pin_count of held_lock for flags. - Add a patch to warn if a task holding a raw spinlock is acquiring an non-raw lock. The purpose of this patchset is to add a new class of locks called terminal locks and converts some of the low level raw or regular spinlocks to terminal locks. A terminal lock does not have forward dependency and it won't allow a lock or unlock operation on another lock. Two level nesting of terminal locks is allowed, though. Only spinlocks that are acquired with the _irq/_irqsave variants or acquired in an IRQ disabled context should be classified as terminal locks. Because of the restrictions on terminal locks, we can do simple checks on them without using the lockdep lock validation machinery. The advantages of making these changes are as follows: 1) It saves table entries used by the validation code and hence make it harder to overflow those tables. 2) The lockdep check in __lock_acquire() will be a little bit faster for terminal locks without using the lock validation code. In fact, it is possible to overflow some of the tables by running a variety of different workloads on a debug kernel. I have seen bug reports about exhausting MAX_LOCKDEP_KEYS, MAX_LOCKDEP_ENTRIES and MAX_STACK_TRACE_ENTRIES. This patch will help to reduce the chance of overflowing some of the tables. Below were selected output lines from the lockdep_stats files of the patched and unpatched kernels after bootup and running parallel kernel builds and some perf benchmarks. Item Unpatched kernel Patched kernel % Change ---- ---------------- -------------- -------- direct dependencies 9924 9032 -9.0% dependency chains 18258 16326 -10.6% dependency chain hlocks 73198 64927 -11.3% stack-trace entries 110502 103225 -6.6% There were some reductions in the size of the lockdep tables. They were not significant, but it is still a good start to rein in the number of entries in those tables to make it harder to overflow them. In term of performance, there isn't that much noticeable differences in both the kernel build and perf benchmark. Low level locking microbenchmark with 4 locking threads shows the following locking rates on a Haswell system: Kernel Lock Rate ------ ---- ---- Unpatched kernel Regular lock 2,288 kop/s Patched kernel Regular lock 2,352 kop/s Terminal lock 2,512 kop/s I was not sure why there was a slight performance improvement with the patched kernel. However, comparing the regular and terminal lock results, there was a slight 7% improvement in locking throughput for terminal locks. Looking at the perf ouput for regular lock: 5.43% run-locktest [kernel.vmlinux] [k] __lock_acquire 4.65% run-locktest [kernel.vmlinux] [k] lock_contended 2.80% run-locktest [kernel.vmlinux] [k] lock_acquired 2.53% run-locktest [kernel.vmlinux] [k] lock_release 1.42% run-locktest [kernel.vmlinux] [k] mark_lock For terminal lock: 5.00% run-locktest [kernel.vmlinux] [k] __lock_acquire 4.66% run-locktest [kernel.vmlinux] [k] lock_contended 2.88% run-locktest [kernel.vmlinux] [k] lock_acquired 2.55% run-locktest [kernel.vmlinux] [k] lock_release 1.34% run-locktest [kernel.vmlinux] [k] mark_lock The __lock_acquire() function ran a bit faster with terminal lock, but the other lockdep functions remain more or less the same in term of performance. In term internal lockdep structure sizes, there should be no size increase for 64-bit architectures with CONFIG_LOCK_STAT defined. The lockdep_map structure will increase in size for 32-bit architectures or when CONFIG_LOCK_STAT isn't defined. Waiman Long (17): locking/lockdep: Remove version from lock_class structure locking/lockdep: Rework lockdep_set_novalidate_class() locking/lockdep: Add a new terminal lock type locking/lockdep: Add DEFINE_TERMINAL_SPINLOCK() and related macros printk: Mark logbuf_lock & console_owner_lock as terminal locks debugobjects: Mark pool_lock as a terminal lock debugobjects: Move printk out of db lock critical sections locking/lockdep: Add support for nestable terminal locks debugobjects: Make object hash locks nestable terminal locks lib/stackdepot: Make depot_lock a terminal spinlock locking/rwsem: Mark rwsem.wait_lock as a terminal lock cgroup: Mark the rstat percpu lock as terminal mm/kasan: Make quarantine_lock a terminal lock dma-debug: Mark free_entries_lock as terminal kernfs: Mark kernfs_open_node_lock as terminal lock delay_acct: Mark task's delays->lock as terminal spinlock locking/lockdep: Check raw/non-raw locking conflicts fs/kernfs/file.c | 2 +- include/linux/lockdep.h | 59 +++++++++++++++++++++++---- include/linux/rwsem.h | 11 +++++- include/linux/spinlock_types.h | 34 ++++++++++------ kernel/cgroup/rstat.c | 9 ++++- kernel/delayacct.c | 4 +- kernel/dma/debug.c | 2 +- kernel/locking/lockdep.c | 81 ++++++++++++++++++++++++++++++++------ kernel/locking/lockdep_internals.h | 5 +++ kernel/locking/lockdep_proc.c | 11 +++++- kernel/locking/rwsem-xadd.c | 1 + kernel/locking/spinlock_debug.c | 1 + kernel/printk/printk.c | 4 +- kernel/printk/printk_safe.c | 2 +- lib/debugobjects.c | 70 ++++++++++++++++++++++---------- lib/stackdepot.c | 2 +- mm/kasan/quarantine.c | 2 +- 17 files changed, 235 insertions(+), 65 deletions(-) -- 1.8.3.1