On Mon, May 4, 2020 at 6:52 PM Will Deacon <will@xxxxxxxxxx> wrote:
> On Mon, Apr 27, 2020 at 01:45:46PM -0700, Sami Tolvanen wrote:
> > On Fri, Apr 24, 2020 at 12:21:14PM +0100, Will Deacon wrote:
> > > Also, since you mentioned the lack of redzoning, isn't it a bit dodgy
> > > allocating blindly out of the kmem_cache? It means we don't have a redzone
> > > or a guard page, so if you can trigger something like a recursion bug then
> > > could you scribble past the SCS before the main stack overflows? Would this
> > > clobber somebody else's SCS?
> >
> > I agree that allocating from a kmem_cache isn't ideal for safety. It's a
> > compromise to reduce memory overhead.
>
> Do you think it would be a problem if we always allocated a page for the
> SCS?

I guess doing this safely and without wasting a page per task would
only be possible in an elegant way once MTE lands on devices?

I wonder how bad context switch latency would be if the actual SCS was
percpu and vmapped (starting at an offset inside the page such that
the SCS can only grow up to something like 0x400 bytes before
panicking the CPU) and the context switch path saved/restored the used
part of the vmapped SCS into a smaller allocation from the slab
allocator... presumably the SCS will usually just be something like
one cacheline big? That probably only costs a moderate amount of time
to copy...

Or as an extension of that, if the SCS copying turns out to be too
costly, there could be a percpu LRU cache consisting of vmapped SCS
pages, and whenever a task gets scheduled that doesn't have a vmapped
SCS, it "swaps out" the contents of the least recently used vmapped
SCS into the corresponding task's slab SCS, and "swaps in" from its
own slab SCS into the vmapped SCS. And task migration would force
"swapping out".

Not sure if this is a good idea, or if I'm just making things worse by
suggesting extra complexity...
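
To make the first variant (copy the used part of the SCS on every
context switch) a bit more concrete, here is a rough userspace sketch
in C. It is only a model of the idea, not kernel code: struct cpu_scs,
struct task_scs, scs_push(), scs_save(), scs_restore() and scs_switch()
are all made-up names, the "vmapped percpu SCS" is just an array, and
where the real thing would fault on a guard page and panic, scs_push()
simply returns -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Growth cap from the mail: roughly 0x400 bytes before the CPU would panic. */
#define SCS_LIMIT 0x400
#define SCS_SLOTS (SCS_LIMIT / sizeof(uintptr_t))

/* Hypothetical stand-in for the percpu, vmapped shadow call stack. */
struct cpu_scs {
	uintptr_t slots[SCS_SLOTS];
	size_t used;		/* bytes currently in use */
};

/* Hypothetical stand-in for the per-task slab allocation. */
struct task_scs {
	uintptr_t saved[SCS_SLOTS];
	size_t used;		/* bytes saved at the last switch-out */
};

/*
 * Push one return address. Returns 0 on success and -1 if the push
 * would exceed SCS_LIMIT (the point where the real design would hit
 * the guard region and panic).
 */
int scs_push(struct cpu_scs *cpu, uintptr_t ret_addr)
{
	if (cpu->used + sizeof(uintptr_t) > SCS_LIMIT)
		return -1;
	cpu->slots[cpu->used / sizeof(uintptr_t)] = ret_addr;
	cpu->used += sizeof(uintptr_t);
	return 0;
}

/* Switch-out: copy only the used part of the percpu SCS to the task. */
void scs_save(const struct cpu_scs *cpu, struct task_scs *task)
{
	memcpy(task->saved, cpu->slots, cpu->used);
	task->used = cpu->used;
}

/* Switch-in: copy the task's saved entries back into the percpu SCS. */
void scs_restore(struct cpu_scs *cpu, const struct task_scs *task)
{
	assert(task->used <= SCS_LIMIT);
	memcpy(cpu->slots, task->saved, task->used);
	cpu->used = task->used;
}

/* Model of the context switch path: save prev, then restore next. */
void scs_switch(struct cpu_scs *cpu, struct task_scs *prev,
		struct task_scs *next)
{
	scs_save(cpu, prev);
	scs_restore(cpu, next);
}
```

The copy cost in this model is proportional to cpu->used, so if the
SCS really is about one cacheline deep at switch time, each switch
copies roughly two cachelines (one out, one in). The LRU variant would
skip scs_switch() entirely whenever the incoming task still owns one
of the percpu vmapped pages.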