On 07/09/2020 11:41 AM, kernel test robot wrote:
> [ 94.349598] BUG: unable to handle page fault for address: ffffed10a7ffddff
> [ 94.351039] #PF: supervisor read access in kernel mode
> [ 94.352172] #PF: error_code(0x0000) - not-present page
> [ 94.353256] PGD 43ffed067 P4D 43ffed067 PUD 43fdee067 PMD 0
> [ 94.354484] Oops: 0000 [#1] SMP KASAN
> [ 94.355238] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc4-00002-ga97a17109332c #1
> [ 94.360456] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [ 94.361950] RIP: 0010:hugetlb_advanced_tests+0x137/0x699
> [ 94.363026] Code: 8b 13 4d 85 f6 75 0b 48 ff 05 2c e4 6a 01 31 ed eb 41 bf f8 ff ff ff ba ff ff 37 00 4c 01 f7 48 c1 e2 2a 48 89 f9 48 c1 e9 03 <80> 3c 11 00 74 05 e8 cd c0 67 fa ba f8 ff ff ff 49 8b 2c 16 48 85
> [ 94.366592] RSP: 0000:ffffc90000047d30 EFLAGS: 00010a06
> [ 94.367693] RAX: 1ffffffff1049b80 RBX: ffff888380525308 RCX: 1ffff110a7ffddff
> [ 94.369215] RDX: dffffc0000000000 RSI: 1ffff11087ffdc00 RDI: ffff88853ffeeff8
> [ 94.370693] RBP: 000000000018e510 R08: 0000000000000025 R09: 0000000000000001
> [ 94.372165] R10: ffff888380523c07 R11: ffffed10700a4780 R12: ffff88843208e510
> [ 94.373674] R13: 0000000000000025 R14: ffff88843ffef000 R15: 000031e01ae61000
> [ 94.375147] FS: 0000000000000000(0000) GS:ffff8883a3800000(0000) knlGS:0000000000000000
> [ 94.376883] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 94.378051] CR2: ffffed10a7ffddff CR3: 0000000004e15000 CR4: 00000000000406a0
> [ 94.379522] Call Trace:
> [ 94.380073] debug_vm_pgtable+0xd81/0x2029
> [ 94.380871] ? pmd_advanced_tests+0x621/0x621
> [ 94.381819] do_one_initcall+0x1eb/0xbd0
> [ 94.382551] ? trace_event_raw_event_initcall_finish+0x240/0x240
> [ 94.383634] ? rcu_read_lock_sched_held+0xb9/0x110
> [ 94.388727] ? rcu_read_lock_held+0xd0/0xd0
> [ 94.389604] ? __kasan_check_read+0x1d/0x30
> [ 94.390485] kernel_init_freeable+0x430/0x4f8
> [ 94.391416] ? rest_init+0x3f8/0x3f8
> [ 94.392185] kernel_init+0x14/0x1e8
> [ 94.392918] ret_from_fork+0x22/0x30
> [ 94.393662] Modules linked in:
> [ 94.394289] CR2: ffffed10a7ffddff
> [ 94.395000] ---[ end trace 8ca5a1655dfb8c39 ]---

The fault is triggered here:

static inline struct mem_section *__nr_to_section(unsigned long nr)
{
#ifdef CONFIG_SPARSEMEM_EXTREME
	if (!mem_section)
		return NULL;
#endif
	if (!mem_section[SECTION_NR_TO_ROOT(nr)])	<-------- BUG
		return NULL;
	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
}

static inline struct mem_section *__pfn_to_section(unsigned long pfn)
{
	return __nr_to_section(pfn_to_section_nr(pfn));
}

#define __pfn_to_page(pfn)					\
({	unsigned long __pfn = (pfn);				\
	struct mem_section *__sec = __pfn_to_section(__pfn);	\
	__section_mem_map_addr(__sec) + __pfn;			\
})

which is reached from hugetlb_advanced_tests():

	paddr = (__pfn_to_phys(pfn) | RANDOM_ORVALUE) & PMD_MASK;
	pte = pte_mkhuge(mk_pte(pfn_to_page(PHYS_PFN(paddr)), prot));

The primary cause is RANDOM_ORVALUE, which is ORed into paddr before
the result is masked with PMD_MASK. This clobbers the pfn value, so the
resulting pfn cannot be found in the relevant memory sections. The
problem stays hidden on other configs where pfn_to_page() does not go
through the memory section lookup.

Dropping RANDOM_ORVALUE solves the problem. I probably meant to drop it
during the V2 series (https://lkml.org/lkml/2020/4/8/997) but don't
remember why I ended up keeping it.
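
To illustrate the arithmetic, here is a minimal user-space sketch, not
kernel code. It assumes x86_64 values (PAGE_SHIFT 12, PMD_SHIFT 21,
SECTION_SIZE_BITS 27) and assumes RANDOM_ORVALUE sets every bit except
the low four; none of these constants are quoted in this mail, and the
helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86_64 values; not quoted in the mail. */
#define PAGE_SHIFT		12
#define PMD_SHIFT		21
#define SECTION_SIZE_BITS	27
#define PMD_MASK		(~((UINT64_C(1) << PMD_SHIFT) - 1))

/* Assumption: all bits set except the low four. */
#define RANDOM_ORVALUE		(~UINT64_C(0xf))

/*
 * Every bit that PMD_MASK keeps is already set in RANDOM_ORVALUE, so
 * the OR-then-mask result does not depend on the input pfn at all.
 */
static_assert((RANDOM_ORVALUE & PMD_MASK) == PMD_MASK,
	      "RANDOM_ORVALUE covers every bit kept by PMD_MASK");

/* Hypothetical helper: the computation as done in the test. */
uint64_t paddr_with_orvalue(uint64_t pfn)
{
	return ((pfn << PAGE_SHIFT) | RANDOM_ORVALUE) & PMD_MASK;
}

/* Hypothetical helper: the same computation with the OR dropped. */
uint64_t paddr_without_orvalue(uint64_t pfn)
{
	return (pfn << PAGE_SHIFT) & PMD_MASK;
}

/* Physical address to page frame number. */
uint64_t phys_pfn(uint64_t paddr)
{
	return paddr >> PAGE_SHIFT;
}

/* Page frame number to sparsemem section number. */
uint64_t section_nr(uint64_t pfn)
{
	return pfn >> (SECTION_SIZE_BITS - PAGE_SHIFT);
}
```

Under these assumptions, an arbitrary pfn such as 0x100000 turns into
paddr 0xffffffffffe00000 and pfn 0xffffffffffe00 once the OR is
applied, which maps to section number 0x1fffffffff, far beyond any
populated section. Without the OR, the same input comes back as pfn
0x100000.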