On 14.02.2014 [02:54:06 -0800], David Rientjes wrote:
> On Thu, 13 Feb 2014, Nishanth Aravamudan wrote:
>
> > There is an open issue on powerpc with memoryless nodes (inasmuch as we
> > can have them, but the kernel doesn't support it properly). There is a
> > separate discussion going on on linuxppc-dev about what is necessary for
> > CONFIG_HAVE_MEMORYLESS_NODES to be supported.
> >
>
> Yeah, and this is causing problems with the slub allocator as well.
>
> > Apologies for hijacking the thread, my comments below were purely about
> > the memoryless node support, not about readahead specifically.
> >
>
> Neither you nor Raghavendra have any reason to apologize to anybody.
> Memoryless node support on powerpc isn't working very well right now and
> you're trying to fix it, that fix is needed both in this thread and in
> your fixes for slub. It's great to see both of you working hard on your
> platform to make it work the best.
>
> I think what you'll need to do in addition to your
> CONFIG_HAVE_MEMORYLESS_NODE fix, which is obviously needed, is to enable
> CONFIG_USE_PERCPU_NUMA_NODE_ID for the same NUMA configurations and then
> use set_numa_node() or set_cpu_numa_node() to properly store the mapping
> between cpu and node rather than numa_cpu_lookup_table. Then you should
> be able to do away with your own implementation of cpu_to_node().
>
> After that, I think it should be as simple as doing
>
> 	set_numa_node(cpu_to_node(cpu));
> 	set_numa_mem(local_memory_node(cpu_to_node(cpu)));
>
> probably before taking vector_lock in smp_callin(). The cpu-to-node
> mapping should be done much earlier in boot while the nodes are being
> initialized, I don't think there should be any problem there.

vector_lock/smp_callin are ia64 specific things, I believe? I think the
equivalent is just in start_secondary() for powerpc? (which in fact is
what calls smp_callin on powerpc).

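To make the suggested sequence concrete, here is a rough userspace mock of
what the two calls would do if placed in a start_secondary()-style path.
This is only a sketch, not kernel code: every mock_* function, the
percpu_* arrays, the two-node topology and the zonelists_built flag are
made up for illustration. In particular, returning NUMA_NO_NODE before the
zonelist stand-in is built is an assumption of the mock, not what the
kernel does today, and the fallback picks the first node with memory
rather than the nearest one.

```c
/* Userspace mock only: all names below are illustrative stand-ins. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS       4
#define MAX_NUMNODES  2
#define NUMA_NO_NODE  (-1)

/* Hypothetical topology: node 0 is memoryless, node 1 has memory. */
static const bool node_has_memory[MAX_NUMNODES] = { false, true };
static const int  cpu_node_map[NR_CPUS] = { 0, 0, 1, 1 };

/* Stand-ins for the per-cpu numa_node and numa_mem variables. */
static int percpu_numa_node[NR_CPUS];
static int percpu_numa_mem[NR_CPUS];

/* Models whether the zonelists have been built yet. */
static bool zonelists_built;

static void mock_build_all_zonelists(void)
{
	zonelists_built = true;
}

/*
 * Mock of local_memory_node(): returns the node itself if it has memory,
 * otherwise the first node that does. Before the zonelists exist it
 * returns NUMA_NO_NODE (assumed behaviour, see the note above).
 */
static int mock_local_memory_node(int node)
{
	assert(node >= 0 && node < MAX_NUMNODES);

	if (!zonelists_built)
		return NUMA_NO_NODE;
	if (node_has_memory[node])
		return node;
	for (int n = 0; n < MAX_NUMNODES; n++)
		if (node_has_memory[n])
			return n;
	return NUMA_NO_NODE;
}

/* Mock of the per-cpu setup a secondary cpu would do as it comes up. */
static void mock_start_secondary(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	int node = cpu_node_map[cpu];

	percpu_numa_node[cpu] = node;                        /* set_numa_node() analogue */
	percpu_numa_mem[cpu] = mock_local_memory_node(node); /* set_numa_mem() analogue */
}
```

Calling mock_start_secondary(0) before mock_build_all_zonelists() leaves
percpu_numa_mem[0] at NUMA_NO_NODE; calling it again afterwards resolves
the memoryless node 0 to node 1.
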
Here is what I'm running into now:

setup_arch ->
	do_init_bootmem ->
		cpu_numa_callback ->
			numa_setup_cpu ->
				map_cpu_to_node ->
					update_numa_cpu_lookup_table

which currently updates the powerpc-specific numa_cpu_lookup_table. I
would like to update that function to use set_cpu_numa_node() and
set_cpu_numa_mem(), but local_memory_node() is not yet functional at that
point, because build_all_zonelists() is called later in start_kernel().
Would it make sense for first_zones_zonelist() to return NUMA_NO_NODE if
we don't have a zone?

Thanks,
Nish