On 29/11/16 08:10, Tejun Heo wrote:
> On Thu, Nov 24, 2016 at 12:05:12AM +1100, Balbir Singh wrote:
>> On my desktop NODES_SHIFT is 6, many distro kernels have it at 9. I've known
>> of solutions that use fake NUMA for partitioning and need as many nodes as
>> possible.
>
> It was a crude kludge that people used before memcg.  If people still
> use it, that's fine but we don't want to optimize / make code
> complicated for it, so let's please put away this part of
> justification.

Are you suggesting those use cases can be ignored now?

> It's understandable that some kernels want to have large NODES_SHIFT
> to support a wide range of configurations but if that makes wastage too
> high, the simpler solution is updating the users to use the runtime
> detected possible number / mask instead of the compile time
> NODES_SHIFT.  Note that we do exactly the same thing for per-cpu
> things - we configure high max but do all operations on what's
> possible on the system.
>
> NUMA code already has possible detection.  Why not simply make memcg
> use those instead of MAX_NUMNODES like how we use nr_cpu_ids instead
> of NR_CPUS?
>

node_possible_map is set to node_online_map at the moment for ppc64,
which becomes a problem when hotplugging a node that was not already
online. I am not sure what you mean by possible detection:
node_possible_map is set based on CONFIG_NODES_SHIFT and can then be
adjusted by the architecture (if desired). Are you suggesting that
firmware populate it?

Thanks,
Balbir Singh