On 05/30/2014 09:06 AM, Linus Torvalds wrote:
> On Fri, May 30, 2014 at 8:52 AM, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>>> That said, it's still likely a non-production option due to the page
>>> table games we'd have to play at fork/clone time.
>>
>> Still, seems much more tractable.
>
> We might be able to make it more attractive by having a small
> front-end cache of the 16kB allocations with the second page unmapped.
> That would at least capture the common "lots of short-lived processes"
> case without having to do kernel page table work.

If we want to use 4k mappings, we'd need to move the stack over to
using vmalloc() (or at least be out of the linear mapping) to avoid
breaking up the linear map's page tables too much.  Doing that, we'd
actually not _have_ to worry about fragmentation, and we could actually
utilize the per-cpu-pageset code since we could be back to using
order-0 pages.  So it's at least not all a loss.  Although, I do
remember playing with 4k stacks back in the 32-bit days and not getting
much of a win with it.

We'd definitely need that cache, if for no other reason than the
vmalloc/vmap code as-is isn't super-scalable.
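
For illustration only, here is a rough user-space sketch of the front-end
cache idea discussed above.  It is not kernel code: mmap() and mprotect()
stand in for the kernel page table work, PROT_NONE stands in for "unmapped",
and the names stack_alloc(), stack_free(), stack_alloc_slow() and CACHE_SLOTS
are made up for this example.  It also assumes 4kB pages.  The point is just
that a freed stack goes back into a small cache with its protection still in
place, so the next allocation can skip the mapping work entirely.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define STACK_SIZE  (16 * 1024)  /* the 16kB allocation from the thread */
#define CACHE_SLOTS 4            /* hypothetical cache depth */

/* Small LIFO cache of already-prepared stacks (not thread-safe). */
static void *stack_cache[CACHE_SLOTS];
static int stack_cache_count;

/* Slow path: map a fresh region and make its second page inaccessible. */
static void *stack_alloc_slow(void)
{
	long page = sysconf(_SC_PAGESIZE);
	char *base;

	/* The sketch only makes sense if two pages fit in the region. */
	if (page <= 0 || 2 * page > STACK_SIZE)
		return NULL;

	base = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return NULL;

	/* Stand-in for unmapping the second page. */
	if (mprotect(base + page, page, PROT_NONE) != 0) {
		munmap(base, STACK_SIZE);
		return NULL;
	}
	return base;
}

/* Fast path: reuse a cached stack, so no mapping work is needed. */
static void *stack_alloc(void)
{
	if (stack_cache_count > 0)
		return stack_cache[--stack_cache_count];
	return stack_alloc_slow();
}

/* Keep the stack (protection intact) if there is room in the cache. */
static void stack_free(void *stack)
{
	if (stack_cache_count < CACHE_SLOTS) {
		stack_cache[stack_cache_count++] = stack;
		return;
	}
	munmap(stack, STACK_SIZE);
}
```

With lots of short-lived processes, most stack_alloc() calls would hit the
cache and hand back a stack that was just freed.  A real version would
presumably want the cache to be per-cpu, which this sketch does not attempt.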