On Mon 24-07-17 09:57:14, Tejun Heo wrote:
> Hello,

Hi, and thanks for the swift answer.

> On Mon, Jul 24, 2017 at 03:42:40PM +0200, Michal Hocko wrote:
[...]
> > My understanding of the pcpu allocator is basically close to zero but it
> > seems weird to me that we would need many TB of vmalloc address space
> > just to allocate vmalloc areas that are in range of hundreds of MB. So I
> > am wondering whether this is an expected behavior of the allocator or
> > there is a problem somewhere else.
>
> It's not actually using the entire region but the area allocations try
> to follow the same topology as kernel linear address layouts. ie. if
> kernel address for different NUMA nodes are apart by certain amount,
> the percpu allocator tries to replicate that for dynamic allocations
> which allows leaving the static and first dynamic area in the kernel
> linear address which helps reducing TLB pressure.
>
> This optimization can be turned off when vmalloc area isn't spacious
> enough by using pcpu_page_first_chunk() instead of
> pcpu_embed_first_chunk() while initializing percpu allocator.

Thanks for the clarification, this is really helpful!

> Can you
> see whether replacing that in arch/powerpc/kernel/setup_64.c fixes the
> issue? If so, all it needs to do is figuring out what conditions we
> need to check to opt out of embedding the first chunk. Note that x86
> 32bit does about the same thing.

Hmm, I will need some help from the PPC guys here. I cannot find anything
ready to implement pcpup_populate_pte and I am not familiar enough with
the ppc memory model to implement one myself.
-- 
Michal Hocko
SUSE Labs