On Fri, Mar 21, 2025 at 02:21:20PM +0800, kernel test robot wrote:
> commit: 6304be90cf5460f33b031e1e19cbe7ffdcbc9f66 ("[PATCH 1/5] mm: compaction: push watermark into compaction_suitable() callers")
> url: https://github.com/intel-lab-lkp/linux/commits/Johannes-Weiner/mm-compaction-push-watermark-into-compaction_suitable-callers/20250314-050839
> base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
> patch link: https://lore.kernel.org/all/20250313210647.1314586-2-hannes@xxxxxxxxxxx/
> patch subject: [PATCH 1/5] mm: compaction: push watermark into compaction_suitable() callers
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> [ 24.321289][ T36] BUG: unable to handle page fault for address: ffff88844000c5f8
> [ 24.322631][ T36] #PF: supervisor read access in kernel mode
> [ 24.323577][ T36] #PF: error_code(0x0000) - not-present page
> [ 24.324482][ T36] PGD 3a01067 P4D 3a01067 PUD 0
> [ 24.325301][ T36] Oops: Oops: 0000 [#1] PREEMPT SMP PTI
> [ 24.326157][ T36] CPU: 1 UID: 0 PID: 36 Comm: kcompactd0 Not tainted 6.14.0-rc6-00559-g6304be90cf54 #1
> [ 24.327631][ T36] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 24.329194][ T36] RIP: 0010:__zone_watermark_ok (mm/page_alloc.c:3256)
> [ 24.330125][ T36] Code: 84 c0 78 14 4c 8b 97 48 06 00 00 45 31 db 4d 85 d2 4d 0f 4f da 4c 01 de 49 29 f1 41 f7 c0 38 02 00 00 0f 85 92 00 00 00 48 98 <48> 03 54 c7 38 49 39 d1 7e 7e b0 01 85 c9 74 7a 83 f9 0a 7f 73 48
> All code
> ========
>    0:   84 c0                   test   %al,%al
>    2:   78 14                   js     0x18
>    4:   4c 8b 97 48 06 00 00    mov    0x648(%rdi),%r10
>    b:   45 31 db                xor    %r11d,%r11d
>    e:   4d 85 d2                test   %r10,%r10
>   11:   4d 0f 4f da             cmovg  %r10,%r11
>   15:   4c 01 de                add    %r11,%rsi
>   18:   49 29 f1                sub    %rsi,%r9
>   1b:   41 f7 c0 38 02 00 00    test   $0x238,%r8d
>   22:   0f 85 92 00 00 00       jne    0xba
>   28:   48 98                   cltq
>   2a:*  48 03 54 c7 38          add    0x38(%rdi,%rax,8),%rdx    <-- trapping instruction

That would be the zone->lowmem_reserve[highest_zoneidx] deref:

        long int lowmem_reserve[4]; /* 0x38 0x20 */

>   2f:   49 39 d1                cmp    %rdx,%r9
>   32:   7e 7e                   jle    0xb2
>   34:   b0 01                   mov    $0x1,%al
>   36:   85 c9                   test   %ecx,%ecx
>   38:   74 7a                   je     0xb4
>   3a:   83 f9 0a                cmp    $0xa,%ecx
>   3d:   7f 73                   jg     0xb2
>   3f:   48                      rex.W
>
> Code starting with the faulting instruction
> ===========================================
>    0:   48 03 54 c7 38          add    0x38(%rdi,%rax,8),%rdx
>    5:   49 39 d1                cmp    %rdx,%r9
>    8:   7e 7e                   jle    0x88
>    a:   b0 01                   mov    $0x1,%al
>    c:   85 c9                   test   %ecx,%ecx
>    e:   74 7a                   je     0x8a
>   10:   83 f9 0a                cmp    $0xa,%ecx
>   13:   7f 73                   jg     0x88
>   15:   48                      rex.W
> [ 24.333001][ T36] RSP: 0018:ffffc90000137cd0 EFLAGS: 00010246
> [ 24.334003][ T36] RAX: 00000000000036a8 RBX: 0000000000000001 RCX: 0000000000000000
> [ 24.335270][ T36] RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff88843fff1080

and %rax and %rdx look like the swapped watermark and zoneidx (36a8 is
14k pages, or 54M, which matches a min watermark on a 16G system).

So this is the bug that Hugh fixed here:

https://lore.kernel.org/all/005ace8b-07fa-01d4-b54b-394a3e029c07@xxxxxxxxxx/

It's resolved in the latest version of the patch in -mm.
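
For anyone who wants to double-check the arithmetic, here is a minimal userspace sketch. It is not kernel code: the helper names are made up for illustration, the register values are copied from the dump, and the 4 KiB page size is an assumption that holds for this x86_64 test machine. It recomputes the effective address of the trapping add from %rdi and %rax, and converts 0x36a8 pages to MiB.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: effective address of disp(base,index,8). */
static uint64_t effective_address(uint64_t base, uint64_t index, uint64_t disp)
{
	return base + index * 8 + disp;
}

/* Hypothetical helper: pages to whole MiB, assuming 4 KiB pages. */
static uint64_t pages_to_mib(uint64_t pages)
{
	return pages * 4096 / (1024 * 1024);
}

/* Values copied from the register dump in the report. */
static void check_oops_arithmetic(void)
{
	const uint64_t rdi = 0xffff88843fff1080ULL; /* base of the zone deref */
	const uint64_t rax = 0x36a8;                /* used as the array index */

	/* 0x38 is the lowmem_reserve[] offset from the layout quoted above. */
	assert(effective_address(rdi, rax, 0x38) == 0xffff88844000c5f8ULL);

	/* 0x36a8 = 13992 pages, a bit over 54 MiB. */
	assert(pages_to_mib(rax) == 54);
}
```

Calling check_oops_arithmetic() passes both assertions. The computed address equals the faulting address in the BUG line, which fits the reading that the watermark value ended up being used as the lowmem_reserve[] index.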