On Sun, Dec 01, 2019 at 04:21:19PM +0100, Borislav Petkov wrote:
> On Sun, Dec 01, 2019 at 04:10:11PM +0100, Borislav Petkov wrote:
> > So lemme first confirm it really is caused by those patches.
>
> Yeah, those patches are causing it. Tried your current master - it is OK
> - and then applied Andrew's patches I was CCed on, ontop, and I got in a
> VM:
>
> VFS: Mounted root (ext4 filesystem) readonly on device 8:2.
> devtmpfs: mounted
> Freeing unused kernel image (initmem) memory: 664K
> Write protecting kernel text and read-only data: 18164k
> NX-protecting the kernel data: 7416k
> BUG: kernel NULL pointer dereference, address: 00000014
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> *pdpt = 0000000000000000 *pde = f000ff53f000ff53
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0+ #3
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
> EIP: __lock_acquire.isra.0+0x2e8/0x4e0
> Code: e8 bd a1 2f 00 85 c0 74 11 8b 1d 08 8f 26 c5 85 db 0f 84 05 1a 00 00 8d 76 00 31 db 8d 65 f4 89 d8 5b 5e 5f 5d c3 8d 74 26 00 <8b> 44 90 04 85 c0 0f 85 4c fd ff ff e9 33 fd ff ff 8d b4 26 00 00
> EAX: 00000010 EBX: 00000010 ECX: 00000001 EDX: 00000000
> ESI: f1070040 EDI: f1070040 EBP: f1073e04 ESP: f1073de0
> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010097
> CR0: 80050033 CR2: 00000014 CR3: 05348000 CR4: 001406b0
> Call Trace:
>  lock_acquire+0x42/0x60
>  ? __walk_page_range+0x4d9/0x590
>  _raw_spin_lock+0x22/0x40
>  ? __walk_page_range+0x4d9/0x590
>  __walk_page_range+0x4d9/0x590

Ok, some more staring.
That offset is:

# mm/pagewalk.c:31: 	pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
	sall	$5, %eax	#, tmp235
	addl	-64(%ebp), %eax	# %sfp, tmp236
	call	page_address	#
	addl	%eax, %esi	# tmp306, __pte
# ./include/linux/spinlock.h:338: 	raw_spin_lock(&lock->rlock);
	movl	-76(%ebp), %eax	# %sfp,
	call	_raw_spin_lock	#
	movl	%edi, %edx	# start, start
	movl	%ebx, -64(%ebp)	# __boundary, %sfp
	movl	-80(%ebp), %edi	# %sfp, ops
	movl	%esi, -40(%ebp)	# __pte, %sfp

i.e., pte_offset_map_lock() and I *think* that ptl thing is NULL.

The Code section decodes to:

Code: e8 bd a1 2f 00 85 c0 74 11 8b 1d 08 8f 26 c5 85 db 0f 84 05 1a 00 00 8d 76 00 31 db 8d 65 f4 89 d8 5b 5e 5f 5d c3 8d 74 26 00 <8b> 44 90 04 85 c0 0f 85 4c fd ff ff e9 33 fd ff ff 8d b4 26 00 00
All code
========
   0:	e8 bd a1 2f 00       	callq  0x2fa1c2
   5:	85 c0                	test   %eax,%eax
   7:	74 11                	je     0x1a
   9:	8b 1d 08 8f 26 c5    	mov    -0x3ad970f8(%rip),%ebx        # 0xffffffffc5268f17
   f:	85 db                	test   %ebx,%ebx
  11:	0f 84 05 1a 00 00    	je     0x1a1c
  17:	8d 76 00             	lea    0x0(%rsi),%esi
  1a:	31 db                	xor    %ebx,%ebx
  1c:	8d 65 f4             	lea    -0xc(%rbp),%esp
  1f:	89 d8                	mov    %ebx,%eax
  21:	5b                   	pop    %rbx
  22:	5e                   	pop    %rsi
  23:	5f                   	pop    %rdi
  24:	5d                   	pop    %rbp
  25:	c3                   	retq
  26:	8d 74 26 00          	lea    0x0(%rsi,%riz,1),%esi
  2a:*	8b 44 90 04          	mov    0x4(%rax,%rdx,4),%eax		<-- trapping instruction
  2e:	85 c0                	test   %eax,%eax
  30:	0f 85 4c fd ff ff    	jne    0xfffffffffffffd82
  36:	e9 33 fd ff ff       	jmpq   0xfffffffffffffd6e
  3b:	8d                   	.byte 0x8d
  3c:	b4 26

which is this corresponding piece in __lock_acquire():

	call	debug_locks_off	#
# kernel/locking/lockdep.c:3775: 	if (!debug_locks_off())
	testl	%eax, %eax	# tmp325
	je	.L562	#,
# kernel/locking/lockdep.c:3777: 	if (debug_locks_silent)
	movl	debug_locks_silent, %ebx	# debug_locks_silent, <retval>
# kernel/locking/lockdep.c:3777: 	if (debug_locks_silent)
	testl	%ebx, %ebx	# <retval>
	je	.L642	#,
	.p2align 4,,10
	.p2align 3
.L562:
# kernel/locking/lockdep.c:3826: 		return 0;
	xorl	%ebx, %ebx	# <retval>
.L557:
# kernel/locking/lockdep.c:3982: }
	leal	-12(%ebp), %esp	#,
	movl	%ebx, %eax	# <retval>,
	popl	%ebx	#
	popl	%esi	#
	popl	%edi	#
	popl	%ebp	#
	ret
	.p2align 4,,10
	.p2align 3
.L649:
# kernel/locking/lockdep.c:3832: 		class = lock->class_cache[subclass];
	movl	4(%eax,%edx,4), %eax	# lock_7(D)->class_cache, class
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^

(the LEA above is NOP padding)

and, going by the register dump, %eax is 0x10 and %edx is 0. So subclass is
0 and lock is only a small offset away from NULL, which would fit the NULL
ptl theory: 0x10 + 0*4 + 4 = 0x14, the address in CR2. I.e., that thing:

	if (subclass < NR_LOCKDEP_CACHING_CLASSES)
		class = lock->class_cache[subclass];
			^^^^^^^^^^^^^^^

AFAICT, of course.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
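
As a cross-check of the address arithmetic for the trapping instruction, here
is a minimal sketch in plain userspace C. It is not kernel code, and both
function names are hypothetical helpers made up for illustration. It assumes
the operand form disp(base,index,scale) from the listing above and 32-bit
wraparound, as on the i386 kernel in the oops.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective address of an x86 memory operand written in AT&T syntax as
 * disp(base,index,scale). Arithmetic wraps at 32 bits, as on i386.
 */
static uint32_t x86_effective_address(uint32_t base, uint32_t index,
				      uint32_t scale, uint32_t disp)
{
	/* The SIB byte can only encode scales of 1, 2, 4 or 8. */
	assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
	return base + index * scale + disp;
}

/*
 * Hypothetical helper: the address read by
 *	movl 4(%eax,%edx,4), %eax
 * for the given EAX (lock) and EDX (subclass) values.
 */
static uint32_t class_cache_load_address(uint32_t eax, uint32_t edx)
{
	return x86_effective_address(eax, edx, 4, 4);
}
```

Calling class_cache_load_address(0x10, 0) with the EAX and EDX values from the
register dump gives 0x14, which is the faulting address reported in CR2.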