On Fri, 4 May 2018 18:08:45 +0200
Michal Hocko <mhocko@xxxxxxxxxx> wrote:

> On Fri 04-05-18 09:53:11, Jonathan Cameron wrote:
> > The case of a new numa node got missed in avoiding using
> > the node info from page_struct during hotplug. In this
> > path we have a call to register_mem_sect_under_node (which allows
> > us to specify it is hotplug so don't change the node),
> > via link_mem_sections which unfortunately does not.
>
> I have hard time to parse the problem description. Could you be more
> specific and describe the user visible effect along with steps to
> trigger the issue?

Hi Michal,

Sure, the result is that (with a new memory-only node) we never
successfully call register_mem_sect_under_node, so we don't get the
memory associated with the node in sysfs, and meminfo for the node
doesn't report it.

It came up whilst testing some arm64 hotplug patches, but appears to be
universal. Whilst I'm triggering it by removing then reinserting memory
to a node with no other elements (thus making the node disappear then
appear again), it appears it would also happen on hotplugging memory
where there was none before, and it doesn't seem to be related to the
arm64 patches.

These patches call __add_pages (where most of the issue was fixed by
Pavel's patch). If there is a node at the time of the __add_pages call
then all is well, as it calls register_mem_sect_under_node from there
with check_nid set to false. Without a node, that function returns
without having done the sysfs related stuff, as there is no node to use.
This is expected, but it is the resulting path that fails...
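
To illustrate why check_nid matters, here is a toy userspace model, not
the kernel code. Every toy_* name is made up for this sketch, and it
assumes that the node id read back from a freshly hotplugged page does
not yet match the new node.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGES_PER_SECTION 8
#define TOY_NID_UNSET (-1)

/* Hypothetical stand-in for the node id recorded in each struct page. */
static int toy_page_nid[TOY_PAGES_PER_SECTION];

/* Mark every page as not yet carrying a valid node id (assumption). */
static void toy_reset_pages(void)
{
    for (int pfn = 0; pfn < TOY_PAGES_PER_SECTION; pfn++)
        toy_page_nid[pfn] = TOY_NID_UNSET;
}

/* Toy analogue of get_nid_for_pfn: read the node id from the page. */
static int toy_get_nid_for_pfn(int pfn)
{
    return toy_page_nid[pfn];
}

/*
 * Toy analogue of register_mem_sect_under_node.
 * Returns true if the section would be linked under node nid.
 * With check_nid set, pages whose node id does not match are skipped,
 * so a section made only of such pages is never linked.
 */
static bool toy_register_section(int nid, bool check_nid)
{
    for (int pfn = 0; pfn < TOY_PAGES_PER_SECTION; pfn++) {
        if (check_nid && toy_get_nid_for_pfn(pfn) != nid)
            continue;   /* keep looking for a matching page */
        return true;    /* the sysfs links would be created here */
    }
    return false;       /* no page matched, nothing was linked */
}

/* Walk through both cases for a new node with id 1. */
static void toy_demo(void)
{
    toy_reset_pages();
    assert(!toy_register_section(1, true));   /* new-node path: no link */
    assert(toy_register_section(1, false));   /* hotplug-aware path: link */
}
```

Calling toy_demo() from a main() exercises both cases. In this model the
only difference between the working and failing paths is whether the
caller is able to pass check_nid as false.
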
Exact path to the problem is as follows:

mm/memory_hotplug.c: add_memory_resource
  The node is not online so we enter the if (new_node) twice; on the
  second such block there is a call to link_mem_sections
which calls into
drivers/node.c: link_mem_sections
which calls
drivers/node.c: register_mem_sect_under_node
which calls get_nid_for_pfn and keeps trying until the output of that
matches the expected node (passed all the way down from
add_memory_resource).

It is effectively the same fix as the one referred to in the fixes tag,
just in the code path for a new node, where the comments point out we
have to rerun the link creation because it will have failed in
register_new_memory (as there was no node at the time).
(Actually that comment is wrong now, as we don't have
register_new_memory any more; it got renamed to hotplug_memory_register
in Pavel's patch.)

Jonathan