On Tue, Mar 12, 2024 at 02:36:27PM -0700, H. Peter Anvin wrote:
> On 3/12/24 12:45, Pasha Tatashin wrote:
> > >
> > > Ok, first of all, talking about "kernel memory" here is misleading.
> >
> > Hi Peter,
> >
> > I re-read my cover letter, and I do not see where "kernel memory" is
> > mentioned. We are talking about kernel stack overhead that is
> > proportional to the user workload, as every active thread has an
> > associated kernel stack. The idea is to save memory by not
> > pre-allocating all pages of kernel stacks, but instead keeping the
> > unmapped pages as a safeguard for when a stack actually becomes
> > deep; that is, to come up with a solution that handles the rare
> > deeper stacks only when needed. This could be done through faulting
> > on the supported hardware (as proposed in this series), or via
> > pre-mapping on every schedule event and checking the accesses when
> > the thread goes off CPU (as proposed by Andy Lutomirski to avoid
> > double faults on x86).
> >
> > In other words, this feature is only about one very specific type of
> > kernel memory that is not even directly mapped (the feature requires
> > vmapped stacks).
> >
> > > Unless your threads are spending nearly all their time sleeping,
> > > the threads will occupy stack and TLS memory in user space as
> > > well.
> >
> > Can you please elaborate: what data is contained in the kernel stack
> > while a thread is in user space? My series requires that thread_info
> > not be on the stack, by depending on THREAD_INFO_IN_TASK.
>
> My point is that what matters is total memory use, not just memory
> used in the kernel. Amdahl's law.

If userspace is running a few processes with many threads and the
userspace stacks are small, kernel stacks could end up dominating. I'd
like to see some numbers, though.
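
To put rough numbers on the "dominating" case (back-of-envelope
arithmetic on my part, not measurements): on x86-64 the default kernel
stack is 16 KiB, so 100,000 mostly-idle threads pin

    100,000 * 16 KiB ~= 1.5 GiB

of kernel stack. If, say, 95% of those threads never grow past the
first 4 KiB page, on-demand backing would need about

    0.95 * 4 KiB + 0.05 * 16 KiB ~= 4.6 KiB per thread, i.e. ~440 MiB,

roughly a 3.5x saving. Whether real workloads actually look like that
is exactly what the numbers would have to show.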
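
On the thread_info point upthread: with CONFIG_THREAD_INFO_IN_TASK=y,
thread_info lives at the start of task_struct rather than at the base
of the kernel stack, which is why the stack can in principle be left
unbacked while a thread runs in user space. Simplified excerpt from
include/linux/sched.h:

struct task_struct {
#ifdef CONFIG_THREAD_INFO_IN_TASK
        /*
         * Must be the first element of task_struct, because
         * current_thread_info() casts task_struct * to thread_info *.
         */
        struct thread_info thread_info;
#endif
        /* ... */
};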
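
And to make the fault-in mechanism concrete, here is a userspace
analogue I sketched (not the series code; it assumes 4 KiB pages and
relies on Linux tolerating mprotect() in a signal handler, which is
not formally async-signal-safe): reserve the whole stack as PROT_NONE
and back pages one at a time from the fault handler, the way the
series would back vmapped kernel stack pages on first touch.

#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define PAGE_SZ   4096UL           /* assumption: 4 KiB pages */
#define STACK_SZ  (4 * PAGE_SZ)    /* a 16 KiB "stack"        */

static char *stack_base;           /* lowest address of the reservation */

static void fault_handler(int sig, siginfo_t *si, void *uc)
{
        uintptr_t addr = (uintptr_t)si->si_addr;
        uintptr_t base = (uintptr_t)stack_base;

        (void)sig; (void)uc;
        if (addr < base || addr >= base + STACK_SZ)
                _exit(1);          /* a genuine wild access: give up */

        /* Back just the faulting page, like faulting in one stack page. */
        if (mprotect((void *)(addr & ~(PAGE_SZ - 1)), PAGE_SZ,
                     PROT_READ | PROT_WRITE))
                _exit(2);
}

int main(void)
{
        struct sigaction sa = { .sa_flags = SA_SIGINFO };
        size_t off;

        sa.sa_sigaction = fault_handler;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);

        /* Reserve address space only; no page is accessible yet. */
        stack_base = mmap(NULL, STACK_SZ, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (stack_base == MAP_FAILED)
                return 1;

        /* Touch the region top-down, the way a growing stack would. */
        for (off = STACK_SZ; off >= PAGE_SZ; off -= PAGE_SZ) {
                stack_base[off - 1] = 1;
                printf("backed page at offset %zu\n", off - PAGE_SZ);
        }
        return 0;
}

The kernel-side version of course has to solve harder problems; taking
a fault on the stack you are currently executing on is exactly the x86
double-fault concern that Andy's pre-map-on-schedule variant is meant
to avoid. But the accounting idea is the same: pages are only charged
once a stack actually grows into them.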