On Wed, Mar 30, 2022 at 07:21:23PM +0530, Srikar Dronamraju wrote:
> With commit 09f49dca570a ("mm: handle uninitialized numa nodes
> gracefully") NODE_DATA for even a memoryless/cpuless node is partially
> initialized at boot time.
>
> Before onlining the node, current Powerpc code checks for NODE_DATA to
> be NULL. However since NODE_DATA is partially initialized, this check
> will end up always being false.
>
> This causes hotplugging a CPU to a memoryless/cpuless node to fail.
>
> Before adding CPUs
> $ numactl -H
> available: 1 nodes (4)
> node 4 cpus: 0 1 2 3 4 5 6 7
> node 4 size: 97372 MB
> node 4 free: 95545 MB
> node distances:
> node   4
>   4:  10
>
> $ lparstat
> System Configuration
> type=Dedicated mode=Capped smt=8 lcpu=1 mem=99709440 kB cpus=0 ent=1.00
>
> %user  %sys %wait %idle physc %entc lbusy   app  vcsw phint
> ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
>  2.66  2.67  0.16 94.51  0.00  0.00  5.33  0.00 67749     0
>
> After hotplugging 32 cores
> $ numactl -H
> node 4 cpus: 0 1 2 3 4 5 6 7 120 121 122 123 124 125 126 127 128 129 130
> 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148
> 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166
> 167 168 169 170 171 172 173 174 175
> node 4 size: 97372 MB
> node 4 free: 93636 MB
> node distances:
> node   4
>   4:  10
>
> $ lparstat
> System Configuration
> type=Dedicated mode=Capped smt=8 lcpu=33 mem=99709440 kB cpus=0 ent=33.00
>
> %user  %sys %wait %idle physc %entc lbusy   app    vcsw phint
> ----- ----- ----- ----- ----- ----- ----- ----- ------- -----
>  0.04  0.02  0.00 99.94  0.00  0.00  0.06  0.00 1128751     3
>
> As we can see numactl is listing only 8 cores while lparstat is showing
> 33 cores.
>
> Also dmesg is showing messages like:
> [ 2261.318350 ] BUG: arch topology borken
> [ 2261.318357 ] the DIE domain not a subset of the NODE domain
>
> Fixes: 09f49dca570a ("mm: handle uninitialized numa nodes gracefully")
> Cc: linuxppc-dev@xxxxxxxxxxxxxxxx
> Cc: linux-mm@xxxxxxxxx
> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
> Reported-by: Geetika Moolchandani <Geetika.Moolchandani1@xxxxxxx>
> Signed-off-by: Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx>

Acked-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>

> ---
>  arch/powerpc/mm/numa.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index b9b7fefbb64b..13022d734951 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1436,7 +1436,7 @@ int find_and_online_cpu_nid(int cpu)
>  	if (new_nid < 0 || !node_possible(new_nid))
>  		new_nid = first_online_node;
>
> -	if (NODE_DATA(new_nid) == NULL) {
> +	if (!node_online(new_nid)) {
>  #ifdef CONFIG_MEMORY_HOTPLUG
>  		/*
>  		 * Need to ensure that NODE_DATA is initialized for a node from
> --
> 2.27.0
>

--
Sincerely yours,
Mike.
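
For readers following the thread, the failure mode described in the commit message can be modelled in user space. The following is only a minimal sketch, not kernel code: every `toy_*` name, `MAX_NODES`, and the two `needs_online_*` helpers are hypothetical stand-ins invented for illustration. It assumes, as the commit message states, that after 09f49dca570a the per-node data pointer is already set at boot even for nodes that are not online.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8	/* hypothetical; stands in for the kernel's node limit */

/* Toy stand-in for the per-node data structure; not the kernel's pg_data_t. */
struct toy_pgdat {
	int node_id;
};

static struct toy_pgdat toy_pgdats[MAX_NODES];
static struct toy_pgdat *toy_node_data[MAX_NODES];	/* models NODE_DATA(nid) */
static bool toy_online[MAX_NODES];			/* models node_online(nid) */

/*
 * Models boot after the referenced commit (assumption taken from the
 * commit message): every node gets a non-NULL, partially initialized
 * data pointer, but only the node that has resources is marked online.
 */
static void toy_boot(int online_nid)
{
	for (int nid = 0; nid < MAX_NODES; nid++) {
		toy_pgdats[nid].node_id = nid;
		toy_node_data[nid] = &toy_pgdats[nid];
		toy_online[nid] = (nid == online_nid);
	}
}

/* Old check: online the node only if its data pointer is NULL. */
static bool needs_online_old(int nid)
{
	return toy_node_data[nid] == NULL;
}

/* New check from the patch: online the node if it is not yet online. */
static bool needs_online_new(int nid)
{
	return !toy_online[nid];
}
```

In this model, after `toy_boot(4)` the old check never asks for node 5 to be onlined because its pointer is already non-NULL, while the new check does; for node 4, which is already online, neither check triggers.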