On Thu, Apr 05, 2018 at 09:49:40AM -0400, Pavel Tatashin wrote:
>> Hi Sasha,
>>
>> I have registered on Azure's portal, and created a VM with 4 CPUs and 16G
>> of RAM. However, I still was not able to reproduce the boot bug you found.
>
>I have also tried to reproduce this issue on Windows 10 + Hyper-V, still
>unsuccessful.

I'm not sure why you can't reproduce it. I built a 4.16 kernel + your 6
patches on top, and booting on a D64s_v3 instance gives me this:

[    1.205726] page:ffffea0084000000 is uninitialized and poisoned
[    1.205737] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
[    1.207016] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
[    1.208014] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
[    1.209087] ------------[ cut here ]------------
[    1.210000] kernel BUG at ./include/linux/mm.h:901!
[    1.210015] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[    1.211000] Modules linked in:
[    1.211000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0+ #10
[    1.211000] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
[    1.211000] RIP: 0010:get_nid_for_pfn+0x6e/0xa0
[    1.211000] RSP: 0000:ffff881c63cbfc28 EFLAGS: 00010246
[    1.211000] RAX: 0000000000000000 RBX: ffffea0084000000 RCX: 0000000000000000
[    1.211000] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffed038c797f78
[    1.211000] RBP: ffff881c63cbfc30 R08: ffff88401174a480 R09: 0000000000000000
[    1.211000] R10: ffff8840e00d6040 R11: 0000000000000000 R12: 0000000002107fff
[    1.211000] R13: fffffbfff4648234 R14: 0000000000000001 R15: 0000000000000001
[    1.211000] FS:  0000000000000000(0000) GS:ffff881c6aa00000(0000) knlGS:0000000000000000
[    1.211000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.211000] CR2: 0000000000000000 CR3: 0000002814216000 CR4: 00000000003406f0
[    1.211000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.211000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    1.211000] Call Trace:
[    1.211000]  register_mem_sect_under_node+0x1a2/0x530
[    1.211000]  link_mem_sections+0x12d/0x200
[    1.211000]  topology_init+0xe6/0x178
[    1.211000]  ? enable_cpu0_hotplug+0x1a/0x1a
[    1.211000]  do_one_initcall+0xb0/0x31f
[    1.211000]  ? initcall_blacklisted+0x220/0x220
[    1.211000]  ? up_write+0x78/0x140
[    1.211000]  ? up_read+0x40/0x40
[    1.211000]  ? __asan_register_globals+0x30/0xa0
[    1.211000]  ? kasan_unpoison_shadow+0x35/0x50
[    1.211000]  kernel_init_freeable+0x69d/0x764
[    1.211000]  ? start_kernel+0x8fd/0x8fd
[    1.211000]  ? finish_task_switch+0x1b6/0x9c0
[    1.211000]  ? rest_init+0x120/0x120
[    1.211000]  kernel_init+0x13/0x150
[    1.211000]  ? rest_init+0x120/0x120
[    1.211000]  ret_from_fork+0x3a/0x50
[    1.211000] Code: ff df 48 c1 ea 03 80 3c 02 00 75 34 48 8b 03 48 83 f8 ff 74 07 48 c1 e8 36 5b 5d c3 48 c7 c6 00 ca f5 9e 48 89 df e8 82 13 d5 fd <0f> 0b 48 c7 c7 00 24 2e a1 e8 05 36 c1 fe e8 af 07 ea fd eb ac
[    1.211000] RIP: get_nid_for_pfn+0x6e/0xa0 RSP: ffff881c63cbfc28
[    1.211017] ---[ end trace d86a03841f7ef229 ]---
[    1.212020] ==================================================================
[    1.213000] BUG: KASAN: stack-out-of-bounds in update_stack_state+0x64c/0x810
[    1.213000] Read of size 8 at addr ffff881c63cbfaf8 by task swapper/0/1
[    1.213000]
[    1.213000] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G D 4.16.0+ #10
[    1.213000] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
[    1.213000] Call Trace:
[    1.213000]  dump_stack+0xe3/0x196
[    1.213000]  ? _atomic_dec_and_lock+0x31a/0x31a
[    1.213000]  ? vprintk_func+0x27/0x60
[    1.213000]  ? printk+0x9c/0xc3
[    1.213000]  ? show_regs_print_info+0x10/0x10
[    1.213000]  ? lock_acquire+0x760/0x760
[    1.213000]  ? update_stack_state+0x64c/0x810
[    1.213000]  print_address_description+0xe4/0x480
[    1.213000]  ? update_stack_state+0x64c/0x810
[    1.213000]  kasan_report+0x1d7/0x460
[    1.213000]  ? console_unlock+0x652/0xe90
[    1.213000]  ? update_stack_state+0x64c/0x810
[    1.213000]  __asan_report_load8_noabort+0x19/0x20
[    1.213000]  update_stack_state+0x64c/0x810
[    1.213000]  ? __read_once_size_nocheck.constprop.2+0x50/0x50
[    1.213000]  ? put_files_struct+0x2a4/0x390
[    1.213000]  ? unwind_next_frame+0x202/0x1230
[    1.213000]  unwind_next_frame+0x202/0x1230
[    1.213000]  ? unwind_dump+0x590/0x590
[    1.213000]  ? get_stack_info+0x42/0x3b0
[    1.213000]  ? debug_check_no_locks_freed+0x300/0x300
[    1.213000]  ? __unwind_start+0x170/0x380
[    1.213000]  __save_stack_trace+0x82/0x140
[    1.213000]  ? put_files_struct+0x2a4/0x390
[    1.213000]  save_stack_trace+0x39/0x70
[    1.213000]  save_stack+0x43/0xd0
[    1.213000]  ? save_stack+0x43/0xd0
[    1.213000]  ? __kasan_slab_free+0x11f/0x170
[    1.213000]  ? kasan_slab_free+0xe/0x10
[    1.213000]  ? kmem_cache_free+0xe6/0x560
[    1.213000]  ? put_files_struct+0x2a4/0x390
[    1.213000]  ? _get_random_bytes+0x162/0x5a0
[    1.213000]  ? trace_hardirqs_off+0xd/0x10
[    1.213000]  ? lock_acquire+0x212/0x760
[    1.213000]  ? rcuwait_wake_up+0x15e/0x2c0
[    1.213000]  ? lock_acquire+0x212/0x760
[    1.213000]  ? free_obj_work+0x8a0/0x8a0
[    1.213000]  ? lock_acquire+0x212/0x760
[    1.213000]  ? acct_collect+0x776/0xe80
[    1.213000]  ? acct_collect+0x2e4/0xe80
[    1.213000]  ? acct_collect+0x2e4/0xe80
[    1.213000]  ? lock_acquire+0x760/0x760
[    1.213000]  ? lock_downgrade+0x910/0x910
[    1.213000]  __kasan_slab_free+0x11f/0x170
[    1.213000]  ? put_files_struct+0x2a4/0x390
[    1.213000]  kasan_slab_free+0xe/0x10
[    1.213000]  kmem_cache_free+0xe6/0x560
[    1.213000]  put_files_struct+0x2a4/0x390
[    1.213000]  ? get_files_struct+0x80/0x80
[    1.213000]  ? do_raw_spin_trylock+0x1f0/0x1f0
[    1.213000]  exit_files+0x83/0xc0
[    1.213000]  do_exit+0x9be/0x2190
[    1.213000]  ? do_invalid_op+0x20/0x30
[    1.213000]  ? mm_update_next_owner+0x1200/0x1200
[    1.213000]  ? get_nid_for_pfn+0x6e/0xa0
[    1.213000]  ? get_nid_for_pfn+0x6e/0xa0
[    1.213000]  ? register_mem_sect_under_node+0x1a2/0x530
[    1.213000]  ? link_mem_sections+0x12d/0x200
[    1.213000]  ? topology_init+0xe6/0x178
[    1.213000]  ? enable_cpu0_hotplug+0x1a/0x1a
[    1.213000]  ? do_one_initcall+0xb0/0x31f
[    1.213000]  ? initcall_blacklisted+0x220/0x220
[    1.213000]  ? up_write+0x78/0x140
[    1.213000]  ? up_read+0x40/0x40
[    1.213000]  ? __asan_register_globals+0x30/0xa0
[    1.213000]  ? kasan_unpoison_shadow+0x35/0x50
[    1.213000]  ? kernel_init_freeable+0x69d/0x764
[    1.213000]  ? start_kernel+0x8fd/0x8fd
[    1.213000]  ? finish_task_switch+0x1b6/0x9c0
[    1.213000]  ? rest_init+0x120/0x120
[    1.213000]  rewind_stack_do_exit+0x17/0x20
[    1.213000]
[    1.213000] The buggy address belongs to the page:
[    1.213000] page:ffffea00718f2fc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[    1.213000] flags: 0x17ffffc0000000()
[    1.213000] raw: 0017ffffc0000000 0000000000000000 0000000000000000 00000000ffffffff
[    1.213000] raw: ffffea00718f2fe0 ffffea00718f2fe0 0000000000000000 0000000000000000
[    1.213000] page dumped because: kasan: bad access detected
[    1.213000]
[    1.213000] Memory state around the buggy address:
[    1.213000]  ffff881c63cbf980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    1.213000]  ffff881c63cbfa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1
[    1.213000] >ffff881c63cbfa80: f1 f8 f2 f2 f2 00 00 00 00 00 00 00 00 00 f3 f3
[    1.213000]                                                                 ^
[    1.213000]  ffff881c63cbfb00: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    1.213000]  ffff881c63cbfb80: f1 f1 f1 f1 00 f2 f2 f2 f2 f2 f2 f2 00 f2 f2 f2
[    1.213000] ==================================================================
[    1.213033] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    1.213033]
[    1.214000] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b