Hey Nathan,

Could you post your boot timing patches? My machines are much smaller than yours, but I'm curious how things behave here as well.

I did some very imprecise timings (strace -t on a telnet attached to the serial console). The 'struct page' initializations take about a minute of boot time for me to do 1TB across 8 NUMA nodes (this is a glueless QPI system[1]). By my _quick_ calculations, it looks about 2x as fast to initialize node0's memory as the other nodes', and boot time increases by about a second for every 30G of memory we add.

So even with nothing else fancy, we could get some serious improvements from just doing the initialization locally.

[1] We call anything that uses pure QPI, without any other circuitry, for the NUMA interconnects "glueless".
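
To put a rough number on "serious improvements", here is a back-of-envelope sketch in Python using only the imprecise figures from the message (1TB, 8 nodes, about a minute, node0 about 2x faster). The model itself is an assumption, not a measurement: it supposes memory is split evenly across nodes, that initialization runs serially from node0, and that the 2x penalty applies uniformly to every remote node. The function and variable names are made up for illustration.

```python
# Back-of-envelope model; inputs are the rough figures quoted in the message.
TOTAL_GB = 1024           # 1TB of memory
NODES = 8                 # NUMA nodes
TOTAL_INIT_SECONDS = 60   # "about a minute" of struct page initialization
REMOTE_SLOWDOWN = 2.0     # node0 initializes roughly 2x as fast as the others


def estimate(total_gb, nodes, total_seconds, remote_slowdown):
    """Return (seconds per GB for local init, seconds if all init were local).

    Assumes equal memory per node and serial initialization from node0,
    with every remote node costing `remote_slowdown` times the local cost.
    """
    per_node_gb = total_gb / nodes
    # Total work expressed in "local-GB equivalents": one local node plus
    # (nodes - 1) remote nodes that each cost remote_slowdown times as much.
    weighted_gb = per_node_gb * (1 + (nodes - 1) * remote_slowdown)
    local_sec_per_gb = total_seconds / weighted_gb
    all_local_seconds = local_sec_per_gb * total_gb
    return local_sec_per_gb, all_local_seconds


local_sec_per_gb, all_local_seconds = estimate(
    TOTAL_GB, NODES, TOTAL_INIT_SECONDS, REMOTE_SLOWDOWN
)
print(f"implied local init rate: {1 / local_sec_per_gb:.1f} GB/s")
print(f"all-local (still serial) estimate: {all_local_seconds:.0f} s "
      f"vs {TOTAL_INIT_SECONDS} s observed")
```

Under these assumptions, the local rate comes out near 32 GB per second and the all-local estimate is roughly half the observed minute, even before any parallelism across nodes. Treat this as an order-of-magnitude check only, since the inputs are rough.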