On Sun, Mar 17, 2024 at 12:41:33AM +0000, Matthew Wilcox wrote:
> On Sat, Mar 16, 2024 at 03:17:57PM -0400, Pasha Tatashin wrote:
> > Expanding on Mathew's idea of an interface for dynamic kernel stack
> > sizes, here's what I'm thinking:
> >
> > - Kernel Threads: Create all kernel threads with a fully populated
> >   THREAD_SIZE stack. (i.e. 16K)
> > - User Threads: Create all user threads with THREAD_SIZE kernel stack
> >   but only the top page mapped. (i.e. 4K)
> > - In enter_from_user_mode(): Expand the thread stack to 16K by mapping
> >   three additional pages from the per-CPU stack cache. This function is
> >   called early in kernel entry points.
> > - exit_to_user_mode(): Unmap the extra three pages and return them to
> >   the per-CPU cache. This function is called late in the kernel exit
> >   path.
> >
> > Both of the above hooks are called with IRQ disabled on all kernel
> > entries whether through interrupts and syscalls, and they are called
> > early/late enough that 4K is enough to handle the rest of entry/exit.
>
> At what point do we replenish the per-CPU stash of pages? If we're
> 12kB deep in the stack and call mutex_lock(), we can be scheduled out,
> and then the new thread can make a syscall. Do we just assume that
> get_free_page() can sleep at kernel entry (seems reasonable)? I don't
> think this is an infeasible problem, I'd just like it to be described.

schedule() or return to userspace, I believe was mentioned
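
To make the replenishment question concrete, here is a rough user-space
model of the scheme as described in the quoted proposal. It is only a
sketch, not kernel code: every name in it (struct cpu_cache, struct
thread_stack, stack_expand(), stack_shrink(), cache_refill()) is
hypothetical, pages are modelled as plain counters, and the refill
target of three pages per CPU is an assumption. It also assumes the
refill happens at schedule() or on return to userspace.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: 16K stack = 4 pages of 4K, 3 of them mapped on demand. */
enum { STACK_PAGES = 4, EXTRA_PAGES = STACK_PAGES - 1, CACHE_TARGET = EXTRA_PAGES };

struct cpu_cache { int free_pages; };      /* per-CPU stash of spare stack pages */
struct thread_stack { int mapped_pages; }; /* pages currently backing the stack */

/* New user thread: only the top page is mapped. */
static void stack_init_user(struct thread_stack *s)
{
    s->mapped_pages = 1;
}

/*
 * Stand-in for the enter_from_user_mode() hook: take the three extra
 * pages from the per-CPU cache. Returns false if the cache cannot
 * cover the request, which is the case raised in the question above.
 */
static bool stack_expand(struct cpu_cache *c, struct thread_stack *s)
{
    assert(s->mapped_pages == 1);
    if (c->free_pages < EXTRA_PAGES)
        return false;
    c->free_pages -= EXTRA_PAGES;
    s->mapped_pages += EXTRA_PAGES;
    return true;
}

/* Stand-in for the exit_to_user_mode() hook: give back all but the top page. */
static void stack_shrink(struct cpu_cache *c, struct thread_stack *s)
{
    c->free_pages += s->mapped_pages - 1;
    s->mapped_pages = 1;
}

/*
 * Assumed replenish point (schedule() or return to userspace), where
 * allocating is allowed to sleep: top the cache back up to the target.
 */
static void cache_refill(struct cpu_cache *c)
{
    if (c->free_pages < CACHE_TARGET)
        c->free_pages = CACHE_TARGET;
}
```

In this model, if thread A expands its stack and is then scheduled out
inside mutex_lock(), the cache is empty and stack_expand() fails for
the next thread entering the kernel on that CPU unless cache_refill()
ran at the schedule() in between.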