On 12/11/18 1:47 AM, Jonathan Cameron wrote:
> When the PCI code later comes along and calls acpi_get_node() for any PCI
> card below the root port, it navigates up the ACPI tree until it finds the
> _PXM value in the root port. This value is then passed to
> acpi_map_pxm_to_node().
>
> As numa_off has not been set on x86 it tries to allocate a NUMA node, from
> the unused set, without setting up all the infrastructure that would
> normally accompany such a call.

FWIW, this _sounds_ like the real problem here. We're allowing an
allocation to proceed without some infrastructure that we require.
Shouldn't we be detecting that this infrastructure is not in place and
warn about *it* at least?

I'm a bit worried that this is just papering over an unknown error to
make a hang go away. It seems a bit too far away from the root cause.
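
To make the concern concrete, here is a minimal userspace sketch in C of
the lookup described in the quoted text, plus the kind of "is the
infrastructure in place?" check being asked about. It is a toy model, not
kernel code and not a proposed patch: every name in it (toy_dev, toy_numa,
toy_find_pxm, toy_map_pxm_to_node, toy_get_node_checked) is hypothetical,
and the "online" flag is only an assumed stand-in for whatever setup
normally accompanies a node allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NO_NODE   (-1)
#define TOY_NO_PXM    (-1)
#define TOY_MAX_NODES 4
#define TOY_MAX_PXM   8

/* Hypothetical device in a toy ACPI-like tree. */
struct toy_dev {
    const struct toy_dev *parent; /* NULL at the top of the tree */
    int pxm;                      /* TOY_NO_PXM if this device has no _PXM */
};

/* Hypothetical NUMA bookkeeping. */
struct toy_numa {
    int  pxm_to_node[TOY_MAX_PXM];   /* TOY_NO_NODE if unmapped */
    bool node_used[TOY_MAX_NODES];   /* node number already handed out */
    bool node_online[TOY_MAX_NODES]; /* assumed: supporting setup was done */
};

void toy_numa_init(struct toy_numa *n)
{
    assert(n != NULL);
    for (int i = 0; i < TOY_MAX_PXM; i++)
        n->pxm_to_node[i] = TOY_NO_NODE;
    for (int i = 0; i < TOY_MAX_NODES; i++) {
        n->node_used[i] = false;
        n->node_online[i] = false;
    }
}

/* Walk up the tree until some ancestor (or the device itself) has a _PXM. */
int toy_find_pxm(const struct toy_dev *dev)
{
    for (; dev != NULL; dev = dev->parent)
        if (dev->pxm != TOY_NO_PXM)
            return dev->pxm;
    return TOY_NO_PXM;
}

/* Return the node for a pxm, taking a node from the unused set if needed. */
int toy_map_pxm_to_node(struct toy_numa *n, int pxm)
{
    if (pxm < 0 || pxm >= TOY_MAX_PXM)
        return TOY_NO_NODE;
    if (n->pxm_to_node[pxm] != TOY_NO_NODE)
        return n->pxm_to_node[pxm];
    for (int node = 0; node < TOY_MAX_NODES; node++) {
        if (!n->node_used[node]) {
            n->node_used[node] = true; /* allocated, but NOT brought online */
            n->pxm_to_node[pxm] = node;
            return node;
        }
    }
    return TOY_NO_NODE;
}

/*
 * Look up a device's node, but refuse (and flag a warning) when the node
 * was never brought online, instead of silently handing it to the caller.
 */
int toy_get_node_checked(struct toy_numa *n, const struct toy_dev *dev,
                         bool *warned)
{
    int node = toy_map_pxm_to_node(n, toy_find_pxm(dev));

    if (node != TOY_NO_NODE && !n->node_online[node]) {
        *warned = true;
        return TOY_NO_NODE;
    }
    return node;
}
```

In this model, a card whose root port carries a _PXM that was never set up
still gets a node number allocated by toy_map_pxm_to_node(), and only the
check in toy_get_node_checked() surfaces that the node is unusable. Whether
the real fix belongs at a comparable point is exactly the open question.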