On Thu, 2014-01-09 at 09:50 -0500, Vivek Goyal wrote:
> On Wed, Jan 08, 2014 at 05:11:48PM -0700, Toshi Kani wrote:
> > On Thu, 2014-01-09 at 00:07 +0100, Rafael J. Wysocki wrote:
> > > On Wednesday, January 08, 2014 10:58:29 AM Vivek Goyal wrote:
> > > > On Wed, Jan 08, 2014 at 11:26:43PM +0800, Baoquan wrote:
> > > >
> > > > [..]
> > > > > [ 1.592222] acpi PNP0A03:03: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
> > > > > [ 1.605045] PCI host bridge to bus 0000:ff
> > > > > [ 1.609615] pci_bus 0000:ff: root bus resource [bus ff]
> > > > > [ 1.632117] System RAM resource [mem 0x01000000-0x7bffffff] cannot be added
> > > > > [ 1.639892] init_memory_mapping: [mem 0x100000000-0x87fffffff]
> > > > > [ 1.717793] swapper/0: page allocation failure: order:9, mode:0x84d0
> > > > > [ 1.724884] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-59.el7.x86_64 #1
> > > > > [ 1.732842] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S001.032520101647 03/25/2010
> > > > > [ 1.743224] 0000000000000000 ffff8800339878c8 ffffffff815b64ad ffff880033987950
> > > > > [ 1.751513] ffffffff8113a980 ffff88003673ab28 00000000000001fe 0000000000000001
> > > > > [ 1.759804] ffff880000000040 ffffffff810bc28a 0000000000000000 0000000000000200
> > > > > [ 1.768096] Call Trace: [348/1928]
> > > > > [ 1.770834] [<ffffffff815b64ad>] dump_stack+0x19/0x1b
> > > > > [ 1.776561] [<ffffffff8113a980>] warn_alloc_failed+0xf0/0x160
> > > > > [ 1.783076] [<ffffffff810bc28a>] ? on_each_cpu_mask+0x2a/0x60
> > > > > [ 1.789581] [<ffffffff8113e92f>] __alloc_pages_nodemask+0x7ff/0xa00
> > > > > [ 1.796672] [<ffffffff815ada2c>] vmemmap_alloc_block+0x62/0xba
> > > > > [ 1.803274] [<ffffffff815ada99>] vmemmap_alloc_block_buf+0x15/0x3b
> > > > > [ 1.810263] [<ffffffff815ab8a6>] vmemmap_populate+0xb4/0x21b
> > > > > [ 1.816673] [<ffffffff815adecd>] sparse_mem_map_populate+0x27/0x35
> > > > > [ 1.823665] [<ffffffff815ad8bf>] sparse_add_one_section+0x7a/0x185
> > > > > [ 1.830659] [<ffffffff8159b74f>] __add_pages+0xaf/0x240
> > > > > [ 1.836588] [<ffffffff81047359>] arch_add_memory+0x59/0xd0
> > > > > [ 1.842804] [<ffffffff8159ba89>] add_memory+0xb9/0x1b0
> > > > > [ 1.848638] [<ffffffff8132dd2c>] acpi_memory_device_add+0x18d/0x26d
> > > > > [ 1.855728] [<ffffffff81303b91>] acpi_bus_device_attach+0x7d/0xcd
> > > > > [ 1.862625] [<ffffffff8131d92d>] acpi_ns_walk_namespace+0xc8/0x17f
> > > > > [ 1.869616] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90
> > > > > [ 1.876896] [<ffffffff81303b14>] ? acpi_bus_type_and_status+0x90/0x90
> > > > > [ 1.884177] [<ffffffff8131de1c>] acpi_walk_namespace+0x95/0xc5
> > > > > [ 1.890780] [<ffffffff81304866>] acpi_bus_scan+0x8b/0x9d
> > > > > [ 1.896805] [<ffffffff81a14a15>] acpi_scan_init+0x63/0x160
> > > > > [ 1.903021] [<ffffffff81a14830>] acpi_init+0x25d/0x2a6
> > > >
> > > > So basically acpi thinks that some memory block is a hot plug memory
> > > > and tries to add it. And that consumes lots of memory and we don't have
> > > > that memory in second kernel.
> > >
> > > That's not exactly the case. What seems to happen is that there is an ACPI
> > > memory object in the ACPI namespace and the ACPI memory hotplug driver
> > > attempts to bind to it. That driver attempts to find removable memory blocks
> > > associated with that object and to add them to the memory map.
> > >
> > > Why don't you simply append acpi=off to the kexec command line? That should
> > > make the problem go away.
> >
> > Yes, that should work, but Baoquan's approach makes sense to me. When
> > memmap=exactmap is specified, the kernel should ignore any memory
> > information from the firmware.
>
> memmap=exactmap is only for E820 map. It does not say that later memory
> can not be hotplugged. So to me specifying exactmap does not imply that
> memory hotplugging is disabled.

There are multiple ways to describe memory range info in the firmware:
e820, the EFI memory descriptor table, and ACPI memory device objects.
They basically provide the same info. This problem happens when the
firmware implements ACPI memory device objects. These objects are
necessary to support memory hotplug, but their presence does not mean
that the system actually supports hotplug. They are optional objects
that firmware vendors may choose to implement.

While the exactmap option does not imply that memory hotplug is
disabled, it does require that the kernel consume only the
user-supplied memory range information. Hence, Baoquan's approach
makes sense to me.

> IMO, it makes sense to have a separate knob to disable memory hotplug
> behavior.

Regular users do not know whether their systems implement ACPI memory
device objects. So, asking users to specify a separate option when
their systems implement such objects is tricky, IMO.

> Also from kdump point of view, I don't want to rely on exactmap as in
> new implementation I am planning to move away from exactmap. I will
> pass new memory map in bootparams and stop passing it on command line.

I think we still need a flag indicating that the kernel may consume
only the new memory map passed in bootparams, and must not obtain
memory range info from the firmware.

Thanks,
-Toshi
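
For illustration only, the flag idea could be modeled roughly as
follows. This is a user-space sketch, not kernel code, and every name
in it (mem_source, boot_policy, user_memmap_only,
may_consume_memory_info) is hypothetical. It only shows the intended
policy: when the flag is set, memory range info from any firmware
source, including ACPI memory device objects, is ignored.

```c
#include <stdbool.h>

/* Hypothetical: where a piece of memory range information came from. */
enum mem_source {
	MEM_SRC_USER,        /* memmap= option or a map passed in bootparams */
	MEM_SRC_E820,        /* firmware e820 table */
	MEM_SRC_EFI,         /* EFI memory descriptor table */
	MEM_SRC_ACPI_MEMDEV, /* ACPI memory device objects */
};

/* Hypothetical boot-time policy; not an existing kernel structure. */
struct boot_policy {
	bool user_memmap_only; /* only the user-supplied map may be consumed */
};

/*
 * Return true if memory range info from src may be consumed.
 * With the flag set, every firmware source is rejected.
 * With the flag clear, all sources are allowed.
 */
bool may_consume_memory_info(const struct boot_policy *policy,
			     enum mem_source src)
{
	if (!policy->user_memmap_only)
		return true;
	return src == MEM_SRC_USER;
}
```

Under this model, a kdump kernel booted with the flag set would
presumably skip the hot-add attempt seen in the trace above, while a
normal boot would behave as it does today.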