Suren,

When running kselftest mm, I believe I've come across a lockdep issue
with the per-vma locking pagefault:

[ 226.105499] WARNING: CPU: 1 PID: 1907 at include/linux/mmap_lock.h:71 handle_userfault+0x34d/0xff0
[ 226.106517] Modules linked in:
[ 226.107060] CPU: 1 PID: 1907 Comm: uffd-unit-tests Not tainted 6.5.0-rc1+ #636
[ 226.108099] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
[ 226.109626] RIP: 0010:handle_userfault+0x34d/0xff0
[ 226.113056] Code: 00 48 85 c0 0f 85 d4 fe ff ff 4c 89 f7 e8 bb 58 ea ff 0f 0b 31 f6 49 8d be a0 01 00 00 e8 0b 8b 53 01 85 c0 0f 85 00 fe ff ff <0f> 0b e9 f9 fd ff ff 49 8d be a0 01 00 00 be ff ff ff ff e8 eb 8a
[ 226.115798] RSP: 0000:ffff888113a8fbf0 EFLAGS: 00010246
[ 226.116570] RAX: 0000000000000000 RBX: ffff888113a8fdc8 RCX: 0000000000000001
[ 226.117630] RDX: 0000000000000000 RSI: ffffffff97a70220 RDI: ffffffff97c316e0
[ 226.118654] RBP: ffff88811de7c1e0 R08: 0000000000000000 R09: ffffed1022991400
[ 226.119508] R10: ffff888114c8a003 R11: 0000000000000000 R12: 0000000000000200
[ 226.120471] R13: ffff88811de7c1f0 R14: ffff888106ebec00 R15: 0000000000001000
[ 226.121521] FS: 00007f226ec0f740(0000) GS:ffff88836f280000(0000) knlGS:0000000000000000
[ 226.122543] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 226.123242] CR2: 00007f226ac0f028 CR3: 00000001088a4001 CR4: 0000000000370ee0
[ 226.124075] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 226.125073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 226.126308] Call Trace:
[ 226.127473] <TASK>
[ 226.128001] ? __warn+0x9c/0x1f0
[ 226.129005] ? handle_userfault+0x34d/0xff0
[ 226.129940] ? report_bug+0x1f2/0x220
[ 226.130700] ? handle_bug+0x3c/0x70
[ 226.131234] ? exc_invalid_op+0x13/0x40
[ 226.131827] ? asm_exc_invalid_op+0x16/0x20
[ 226.132516] ? handle_userfault+0x34d/0xff0
[ 226.133193] ? __pfx_do_raw_spin_lock+0x10/0x10
[ 226.133862] ? find_held_lock+0x83/0xa0
[ 226.134602] ? do_anonymous_page+0x81f/0x870
[ 226.135314] ? __pfx_handle_userfault+0x10/0x10
[ 226.136226] ? __pte_offset_map_lock+0xd4/0x160
[ 226.136958] ? do_raw_spin_unlock+0x92/0xf0
[ 226.137547] ? preempt_count_sub+0xf/0xc0
[ 226.138011] ? _raw_spin_unlock+0x24/0x40
[ 226.138594] ? do_anonymous_page+0x81f/0x870
[ 226.139239] __handle_mm_fault+0x40a/0x470
[ 226.139749] ? __pfx___handle_mm_fault+0x10/0x10
[ 226.140516] handle_mm_fault+0xe9/0x270
[ 226.141015] do_user_addr_fault+0x1a9/0x810
[ 226.141638] exc_page_fault+0x58/0xe0
[ 226.142101] asm_exc_page_fault+0x22/0x30
[ 226.142713] RIP: 0033:0x561107c4967e
[ 226.143391] Code: 48 89 85 18 ff ff ff e9 e2 00 00 00 48 8b 15 49 a0 00 00 48 8b 05 2a a0 00 00 48 0f af 45 f8 48 83 c0 2f 48 01 d0 48 83 e0 f8 <48> 8b 00 48 89 45 c8 48 8b 05 54 a0 00 00 48 8b 55 f8 48 c1 e2 03
[ 226.145946] RSP: 002b:00007ffee4f22120 EFLAGS: 00010206
[ 226.146745] RAX: 00007f226ac0f028 RBX: 00007ffee4f22448 RCX: 00007f226eca1bb4
[ 226.147912] RDX: 00007f226ac0f000 RSI: 0000000000000001 RDI: 0000000000000000
[ 226.149093] RBP: 00007ffee4f22220 R08: 0000000000000000 R09: 0000000000000000
[ 226.150218] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
[ 226.151313] R13: 00007ffee4f22458 R14: 0000561107c52dd8 R15: 00007f226ee34020
[ 226.152464] </TASK>
[ 226.152802] irq event stamp: 3177751
[ 226.153348] hardirqs last enabled at (3177761): [<ffffffff95d9fa69>] __up_console_sem+0x59/0x80
[ 226.154679] hardirqs last disabled at (3177772): [<ffffffff95d9fa4e>] __up_console_sem+0x3e/0x80
[ 226.155998] softirqs last enabled at (3177676): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
[ 226.157364] softirqs last disabled at (3177667): [<ffffffff95ccea54>] irq_exit_rcu+0x94/0xf0
[ 226.158721] ---[ end trace 0000000000000000 ]---

With CONFIG_PER_VMA_LOCK, the fault path calls handle_mm_fault() in
mm/memory.c.

handle_mm_fault() may have an outdated comment, depending on what "mm
semaphore" means:

 * By the time we get here, we already hold the mm semaphore

Decoded addresses from the trace:

__handle_mm_fault+0x40a/0x470:
do_pte_missing at mm/memory.c:3672
 (inlined by) handle_pte_fault at mm/memory.c:4955
 (inlined by) __handle_mm_fault at mm/memory.c:5095

handle_userfault+0x34d/0xff0:
mmap_assert_write_locked at include/linux/mmap_lock.h:71
 (inlined by) __is_vma_write_locked at include/linux/mm.h:673
 (inlined by) vma_assert_locked at include/linux/mm.h:714
 (inlined by) assert_fault_locked at include/linux/mm.h:747
 (inlined by) handle_userfault at fs/userfaultfd.c:440

It looks like vma_assert_locked() is causing a problem when the
mmap_lock is not held in write mode.

Would the easy fix be to check that the mmap_lock is held in write mode
in every other call location BUT the vma_assert_locked() path?

Thanks,
Liam