On 14 July 2018 at 08:20, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Jul 13, 2018 at 4:51 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> I'm building a "replace VM_BUG_ON() with proper printk's instead" right now.
>
> Ok, the machine now stays up, and I get messages like
>
>   Removed VM_BUG_ON()!
>    pfn c2400 - c25ff
>    zone DMA32 DMA
>    zone pfn 1000 1
>
>   Removed VM_BUG_ON()!
>    pfn c0a00 - c0bff
>    zone DMA32 DMA
>    zone pfn 1000 1
>
>   Removed VM_BUG_ON()!
>    pfn c2200 - c23ff
>    zone DMA DMA32
>    zone pfn 1 1000
>
> instead.
>
> That's from
>
> +       printk("Removed VM_BUG_ON()!\n");
> +       printk(" pfn %lx - %lx\n", page_to_pfn(start_page),
> page_to_pfn(end_page));
> +       printk(" zone %s %s\n", page_zone(start_page)->name,
> page_zone(end_page)->name);
> +       printk(" zone pfn %lx %lx\n",
> page_zone(start_page)->zone_start_pfn,
> page_zone(end_page)->zone_start_pfn);
>
> inside an if() statement that replaced that VM_BUG_ON().
>
> WTF? That's just odd.
>
> But everything seems to work fine, and now it doesn't crash.
>
> But there's something really odd going on wrt page_zone() and/or page_to_pfn().
>
> page_to_pfn() implies this is just regular memory in the 3GB area. It
> is likely related to this:
>
>   BIOS-e820: [mem 0x00000000c0b33000-0x00000000c226cfff] reserved
>   BIOS-e820: [mem 0x00000000c226d000-0x00000000c227efff] ACPI data
>   BIOS-e820: [mem 0x00000000c227f000-0x00000000c2439fff] usable
>   BIOS-e820: [mem 0x00000000c243a000-0x00000000c2a61fff] ACPI NVS
>   BIOS-e820: [mem 0x00000000c2a62000-0x00000000c32fefff] reserved
>   BIOS-e820: [mem 0x00000000c32ff000-0x00000000c32fffff] usable
>   BIOS-e820: [mem 0x00000000c3300000-0x00000000c7ffffff] reserved
>
> I dunno. It's a bit odd. I'm not sure I understand that VM_BUG_ON().
> Adding Ard (who worked on the memblock_next_valid_pfn() thing not that
> long ago) and must have hit this same BUG_ON() because he modified it
> not that long ago.
>
> Ard, I triggered the VM_BUG_ON() in mm/page_alloc.c:2016, with a call trace of
>
>   RIP: move_freepages_block()
>   Call Trace:
>     steal_suitable_fallback
>     get_page_from_freelist
>     ...
>
> just for some context.
>

Pavel's fix for this issue in commit e181ae0c5db9 is causing boot
problems on i686 for me. Is anyone else seeing the same? I get no
output whatsoever when booting an i386_defconfig kernel under qemu/kvm
(without EFI).
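
As a side note on the quoted pfn ranges: they can be mapped back onto the quoted BIOS-e820 lines with a little arithmetic. The sketch below is plain user-space C, not kernel code. It assumes 4 KiB pages (PAGE_SHIFT of 12, as on x86), and the table and the helper pfn_e820_type() are hypothetical names introduced only for illustration. It only covers the e820 entries quoted above, so anything outside them is reported as "unlisted".

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages, as on x86 */

struct e820_entry {
	uint64_t start;   /* first byte of the region, inclusive */
	uint64_t end;     /* last byte of the region, inclusive */
	const char *type; /* type string as printed in the boot log */
};

/* Only the BIOS-e820 lines quoted in the mail, not a full memory map. */
static const struct e820_entry e820_map[] = {
	{ 0xc0b33000ULL, 0xc226cfffULL, "reserved"  },
	{ 0xc226d000ULL, 0xc227efffULL, "ACPI data" },
	{ 0xc227f000ULL, 0xc2439fffULL, "usable"    },
	{ 0xc243a000ULL, 0xc2a61fffULL, "ACPI NVS"  },
	{ 0xc2a62000ULL, 0xc32fefffULL, "reserved"  },
	{ 0xc32ff000ULL, 0xc32fffffULL, "usable"    },
	{ 0xc3300000ULL, 0xc7ffffffULL, "reserved"  },
};

/*
 * Hypothetical helper: return the e820 type of the quoted entry that
 * contains the first byte of the given pfn, or "unlisted" if the pfn
 * falls outside the quoted excerpt.
 */
static const char *pfn_e820_type(uint64_t pfn)
{
	uint64_t addr = pfn << PAGE_SHIFT; /* pfn -> physical address */
	size_t i;

	for (i = 0; i < sizeof(e820_map) / sizeof(e820_map[0]); i++) {
		if (addr >= e820_map[i].start && addr <= e820_map[i].end)
			return e820_map[i].type;
	}
	return "unlisted";
}
```

Calling pfn_e820_type() on both ends of each reported range (0xc2400 and 0xc25ff, 0xc0a00 and 0xc0bff, 0xc2200 and 0xc23ff) gives a different type for the start pfn than for the end pfn in all three cases. That is only an observation about the quoted numbers, not a diagnosis of the zone mismatch.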