On Mon, Jun 01, 2020 at 09:55:38AM +0100, Dave P Martin wrote:
> On Thu, May 28, 2020 at 05:34:13PM +0100, Catalin Marinas wrote:
> > On Thu, May 28, 2020 at 12:05:09PM +0100, Szabolcs Nagy wrote:
> > > The 05/28/2020 10:14, Catalin Marinas wrote:
> > > > On Wed, May 27, 2020 at 11:57:39AM -0700, Peter Collingbourne wrote:
> > > > > Should the userspace stack always be mapped as if with PROT_MTE if the
> > > > > hardware supports it? Such a change would be invisible to non-MTE
> > > > > aware userspace since it would already need to opt in to tag checking
> > > > > via prctl. This would let userspace avoid a complex stack
> > > > > initialization sequence when running with stack tagging enabled on the
> > > > > main thread.
> > > >
> > > > I don't think the stack initialisation is that difficult. On program
> > > > startup (can be the dynamic loader). Something like (untested):
> > > >
> > > >     register unsigned long stack asm ("sp");
> > > >     unsigned long page_sz = sysconf(_SC_PAGESIZE);
> > > >
> > > >     mprotect((void *)(stack & ~(page_sz - 1)), page_sz,
> > > >              PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);
> > > >
> > > > (the essential part is PROT_GROWSDOWN so that you don't have to specify
> > > > a stack lower limit)
> > >
> > > does this work even if the currently mapped stack is more than page_sz?
> > > determining the mapped main stack area is i think non-trivial to do in
> > > userspace (requires parsing /proc/self/maps or similar).
> >
> > Because of PROT_GROWSDOWN, the kernel adjusts the start of the range
> > down automatically. It is potentially problematic if the top of the
> > stack is more than a page away and you want the whole stack coloured. I
> > haven't run a test but my reading of the kernel code is that the stack
> > vma would be split in this scenario, so the range beyond sp+page_sz
> > won't have PROT_MTE set.
> >
> > My assumption is that if you do this during program start, the stack is
> > smaller than a page. Alternatively, could we use argv or envp to
> > determine the top of the user stack (the bottom is taken care of by the
> > kernel)?
>
> I don't think you can easily know when the stack ends, but perhaps it
> doesn't matter.
>
> From memory, the initial stack looks like:
>
>     argv/env strings
>     AT_NULL
>     auxv
>     NULL
>     env
>     NULL
>     argv
>     argc    <--- sp
>
> If we don't care about tagging the strings correctly, we could step to
> the end of auxv and tag down from there.
>
> If we do care about tagging the strings, there's probably no good way
> to find the end of the string area, other than looking up sp in
> /proc/self/maps. I'm not sure we should trust all past and future
> kernels to spit out the strings in a predictable order.

I don't think we care about tagging whatever the kernel places on the
stack since the argv/envp pointers are untagged. An mprotect(PROT_MTE)
may or may not cover the environment but it shouldn't matter as the
kernel clears the tags on the corresponding pages anyway.

AFAIK stack tagging works by colouring a stack frame on function entry
and clearing the tags on return. We would only hit a problem if the
function issuing mprotect(sp, PROT_MTE) and its callers already assumed
a PROT_MTE stack. Without PROT_MTE, an STG would be write-ignore, so
subsequently turning it on would lead to a mismatch between the pointer
and the allocation tags.

So turning PROT_MTE on should happen very early in the user process
startup code, before any code with stack tagging enabled runs. Whether
you reach the top of the stack with such mprotect() doesn't really
matter since up to that point there should not be any use of stack
tagging.

If that's not possible, for example the glibc code setting up the stack
was compiled with stack tagging itself, the kernel would have to enable
it when the user process starts. However, I'd only do this based on some
ELF note.

-- 
Catalin
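
For illustration only, the "step to the end of auxv and tag down from
there" idea quoted above could be sketched roughly as below. This is
untested on MTE hardware and only computes the range; it does not issue
the mprotect(). The helper names end_of_auxv() and stack_tag_range() are
hypothetical, and the sketch assumes that envp still points at the
original array on the initial stack and that auxv follows the NULL
terminating envp, as in the layout quoted above.

```c
#include <assert.h>
#include <elf.h>	/* AT_NULL */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>	/* sysconf() */

extern char **environ;

/*
 * Hypothetical helper: return the address just past the AT_NULL entry
 * of auxv. Assumes envp is the untouched array from the initial stack.
 */
static uintptr_t end_of_auxv(char **envp)
{
	char **p = envp;
	unsigned long *aux;

	while (*p)			/* walk the env pointers */
		p++;
	aux = (unsigned long *)(p + 1);	/* auxv starts after envp's NULL */

	while (aux[0] != AT_NULL)	/* entries are (type, value) pairs */
		aux += 2;

	return (uintptr_t)(aux + 2);	/* step over the AT_NULL pair too */
}

/*
 * Hypothetical helper: compute a page-aligned range that starts at the
 * page containing sp and ends at the page boundary above the end of
 * auxv. With PROT_GROWSDOWN the kernel would extend the start downwards.
 */
static void stack_tag_range(uintptr_t sp, char **envp,
			    uintptr_t *start, size_t *len)
{
	long page_sz = sysconf(_SC_PAGESIZE);
	uintptr_t mask, end;

	assert(page_sz > 0);
	mask = (uintptr_t)page_sz - 1;
	end = (end_of_auxv(envp) + mask) & ~mask;

	*start = sp & ~mask;
	*len = end - *start;
}
```

Early startup code would pass the current sp and the original envp, then
hand start and len to the mprotect() call quoted above in place of the
single page. The argv/env strings above auxv are only covered as far as
the next page boundary, which matches the point that tagging them is not
a concern.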