On Tue, Mar 13, 2018 at 08:38:57PM -0400, Pavel Tatashin wrote:
>Hi Sasha,
>
>It seems the patch is doing the right thing, and it catches bugs. Here
>we access uninitialized struct page. The question is why this happens?

Not completely; note that we die on an invalid reference rather than
assertion failure.

>register_mem_sect_under_node(struct memory_block *mem_blk, int nid)
>    page_nid = get_nid_for_pfn(pfn);
>
>node id is stored in page flags, and since struct page is poisoned,
>and the pattern is recognized, the panic is triggered.
>
>Do you have config file? Also, instructions how to reproduce it?

Attached the config. It just happens on boot.

>Thank you,
>Pasha
>
>
>On Tue, Mar 13, 2018 at 7:43 PM, Sasha Levin
><Alexander.Levin@xxxxxxxxxxxxx> wrote:
>> On Wed, Jan 31, 2018 at 04:02:59PM -0500, Pavel Tatashin wrote:
>>>During boot we poison struct page memory in order to ensure that no one is
>>>accessing this memory until the struct pages are initialized in
>>>__init_single_page().
>>>
>>>This patch adds more scrutiny to this checking, by making sure that flags
>>>do not equal to poison pattern when the are accessed. The pattern is all
>>>ones.
>>>
>>>Since, node id is also stored in struct page, and may be accessed quiet
>>>early we add the enforcement into page_to_nid() function as well.
>>>
>>>Signed-off-by: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
>>>---
>>
>> Hey Pasha,
>>
>> This patch is causing the following on boot:
>>
>> [ 1.253732] BUG: unable to handle kernel paging request at fffffffffffffffe
>> [ 1.254000] PGD 2284e19067 P4D 2284e19067 PUD 2284e1b067 PMD 0
>> [ 1.254000] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
>> [ 1.254000] Modules linked in:
>> [ 1.254000] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc5-next-20180313 #10
>> [ 1.254000] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
>> [ 1.254000] RIP: 0010:__dump_page (??:?)
>> [ 1.254000] RSP: 0000:ffff881c63c17810 EFLAGS: 00010246
>> [ 1.254000] RAX: dffffc0000000000 RBX: ffffea0084000000 RCX: 1ffff1038c782f2b
>> [ 1.254000] RDX: 1fffffffffffffff RSI: ffffffff9e160640 RDI: ffffea0084000000
>> [ 1.254000] RBP: ffff881c63c17c00 R08: ffff8840107e8880 R09: ffffed0802167a4d
>> [ 1.254000] R10: 0000000000000001 R11: ffffed0802167a4c R12: 1ffff1038c782f07
>> [ 1.254000] R13: ffffea0084000020 R14: fffffffffffffffe R15: ffff881c63c17bd8
>> [ 1.254000] FS: 0000000000000000(0000) GS:ffff881c6ac00000(0000) knlGS:0000000000000000
>> [ 1.254000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 1.254000] CR2: fffffffffffffffe CR3: 0000002284e16000 CR4: 00000000003406e0
>> [ 1.254000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 1.254000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 1.254000] Call Trace:
>> [ 1.254000] dump_page (/mm/debug.c:80)
>> [ 1.254000] get_nid_for_pfn (/./include/linux/mm.h:900 /drivers/base/node.c:396)
>> [ 1.254000] register_mem_sect_under_node (/drivers/base/node.c:438)
>> [ 1.254000] link_mem_sections (/drivers/base/node.c:517)
>> [ 1.254000] topology_init (/./include/linux/nodemask.h:271 /arch/x86/kernel/topology.c:164)
>> [ 1.254000] do_one_initcall (/init/main.c:835)
>> [ 1.254000] kernel_init_freeable (/init/main.c:901 /init/main.c:909 /init/main.c:927 /init/main.c:1076)
>> [ 1.254000] kernel_init (/init/main.c:1004)
>> [ 1.254000] ret_from_fork (/arch/x86/entry/entry_64.S:417)
>> [ 1.254000] Code: ff a8 01 4c 0f 44 f3 4d 85 f6 0f 84 31 0e 00 00 4c 89 f2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 2d 11 00 00 <49> 83 3e ff 0f 84 a9 06 00 00 4d 8d b7 c0 fd ff ff 48 b8 00 00
>> All code
>> ========
>>    0:   ff a8 01 4c 0f 44       ljmp   *0x440f4c01(%rax)
>>    6:   f3 4d 85 f6             repz test %r14,%r14
>>    a:   0f 84 31 0e 00 00       je     0xe41
>>   10:   4c 89 f2                mov    %r14,%rdx
>>   13:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
>>   1a:   fc ff df
>>   1d:   48 c1 ea 03             shr    $0x3,%rdx
>>   21:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1)
>>   25:   0f 85 2d 11 00 00       jne    0x1158
>>   2b:*  49 83 3e ff             cmpq   $0xffffffffffffffff,(%r14)     <-- trapping instruction
>>   2f:   0f 84 a9 06 00 00       je     0x6de
>>   35:   4d 8d b7 c0 fd ff ff    lea    -0x240(%r15),%r14
>>   3c:   48                      rex.W
>>   3d:   b8                      .byte 0xb8
>>         ...
>>
>> Code starting with the faulting instruction
>> ===========================================
>>    0:   49 83 3e ff             cmpq   $0xffffffffffffffff,(%r14)
>>    4:   0f 84 a9 06 00 00       je     0x6b3
>>    a:   4d 8d b7 c0 fd ff ff    lea    -0x240(%r15),%r14
>>   11:   48                      rex.W
>>   12:   b8                      .byte 0xb8
>>         ...
>> [ 1.254000] RIP: __dump_page+0x1c8/0x13c0 RSP: ffff881c63c17810 (/./include/asm-generic/sections.h:42)
>> [ 1.254000] CR2: fffffffffffffffe
>> [ 1.254000] ---[ end trace e643dfbc44b562ca ]---
>>
>> --
>>
>> Thanks,
>> Sasha

--

Thanks,
Sasha
Attachment: config-sasha.gz
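
A note on the faulting address in the trace: CR2 and R14 both hold fffffffffffffffe, which appears consistent with a compound-head lookup on a struct page whose words are still all ones. Under the tail-page convention, a set bit 0 marks a tail page and the head pointer is the stored value minus one. The following is a minimal userspace sketch of that arithmetic only, not the kernel's code. `struct fake_page`, `page_poisoned()`, `head_address()`, and `poison_demo()` are hypothetical names introduced for illustration, and the two-word layout is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical two-word stand-in for struct page (assumption: the real
 * kernel layout is larger and differs between versions). */
struct fake_page {
    unsigned long flags;
    unsigned long compound_head;
};

/* The poison pattern described in the patch text: all ones. */
#define PAGE_POISON_PATTERN (~0UL)

/* Returns nonzero when the flags word still holds the poison pattern. */
static int page_poisoned(const struct fake_page *page)
{
    return page->flags == PAGE_POISON_PATTERN;
}

/* Tail-page convention: if bit 0 is set, the head is the stored value
 * minus 1; otherwise the page is its own head. */
static uintptr_t head_address(const struct fake_page *page)
{
    unsigned long head = page->compound_head;

    if (head & 1)
        return (uintptr_t)(head - 1);
    return (uintptr_t)page;
}

/* Fills a page with 0xff bytes (all ones, per the patch description) and
 * returns the address a head lookup would then try to dereference. */
static uintptr_t poison_demo(void)
{
    struct fake_page page;

    memset(&page, 0xff, sizeof(page));
    return page_poisoned(&page) ? head_address(&page) : 0;
}
```

On an LP64 target, `poison_demo()` returns 0xfffffffffffffffe, the same value as CR2 in the oops. That would explain why the crash is a paging fault inside the dump path rather than a clean assertion: the poison check itself ends up dereferencing a pointer derived from poisoned memory.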