On Mon, 27 Apr 2020 at 18:00, Sami Tolvanen <samitolvanen@xxxxxxxxxx> wrote:
>
> This patch series adds support for Clang's Shadow Call Stack
> (SCS) mitigation, which uses a separately allocated shadow stack
> to protect against return address overwrites. More information
> can be found here:
>
>   https://clang.llvm.org/docs/ShadowCallStack.html
>
> SCS provides better protection against traditional buffer
> overflows than CONFIG_STACKPROTECTOR_*, but it should be noted
> that SCS security guarantees in the kernel differ from the ones
> documented for user space. The kernel must store addresses of
> shadow stacks in memory, which means an attacker capable of
> reading and writing arbitrary memory may be able to locate them
> and hijack control flow by modifying the shadow stacks.
>
> SCS is currently supported only on arm64, where the compiler
> requires the x18 register to be reserved for holding the current
> task's shadow stack pointer.
>
> With -fsanitize=shadow-call-stack, the compiler injects
> instructions to all non-leaf C functions to store the return
> address to the shadow stack, and unconditionally load it again
> before returning. As a result, SCS is incompatible with features
> that rely on modifying function return addresses in the kernel
> stack to alter control flow. A copy of the return address is
> still kept in the kernel stack for compatibility with stack
> unwinding, for example.
>
> SCS has a minimal performance overhead, but allocating
> shadow stacks increases kernel memory usage. The feature is
> therefore mostly useful on hardware that lacks support for PAC
> instructions.
>
> Changes in v13:
> - Changed thread_info::shadow_call_stack to a base address and
>   an offset instead, and removed the now unneeded __scs_base()
>   and scs_save().
> - Removed alignment from the kmem_cache and static allocations.
> - Removed the task_set_scs() helper function.
> - Moved the assembly code for loading and storing the offset in
>   thread_info to scs_load/save macros.
> - Added offset checking to scs_corrupted().
> - Switched to cmpxchg_relaxed() in scs_check_usage().
>

OK, so one thing that came up in an offline discussion about SCS is
the way it interacts with the vmap'ed stack.

The vmap'ed stack is great for robustness, but it only works if things
don't explode for other reasons in the meantime. This means the
ordinary-to-shadow-call-stack size ratio should be chosen such that it
is *really* unlikely you could ever overflow the shadow call stack and
corrupt another task's call stack before hitting the vmap stack's
guard region.

Alternatively, I wonder if there is a way we could let the SCS and
ordinary stack share [the bottom of] the vmap'ed region. That would
give rather nasty results if the ordinary stack overflows into the
SCS, but for cases where we really recurse out of control, we could
catch this occurrence on either stack, whichever one overflows first.
And the nastiness, when it does occur, will not corrupt any state
beyond the stack of the current task.
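
To put rough numbers on the size-ratio point, here is a back-of-the-envelope
sketch in plain C. Every constant in it is an assumption made for
illustration (a 16 KiB ordinary stack, a 1 KiB shadow stack, 8 bytes per
shadow entry, and a 16-byte minimal frame holding only x29/x30); none of
these values is taken from the series, and the helper names are made up.
It compares the call depth at which each stack runs out, and treats a tie
as unsafe to stay on the conservative side.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative assumptions only, not values from the patch series. */
#define STACK_BYTES     (16 * 1024) /* assumed ordinary kernel stack size */
#define SCS_BYTES       (1 * 1024)  /* assumed shadow call stack size */
#define SCS_ENTRY_BYTES 8           /* one 64-bit return address per non-leaf call */
#define MIN_FRAME_BYTES 16          /* assumed smallest frame: saved x29 + x30 */

/* Number of nested calls that fit into a stack of `bytes` bytes. */
static size_t max_depth(size_t bytes, size_t bytes_per_call)
{
	assert(bytes_per_call > 0);
	return bytes / bytes_per_call;
}

/*
 * Returns 1 if runaway recursion with frames of `frame_bytes` bytes could
 * exhaust the shadow stack no later than the ordinary stack, i.e. before
 * the guard region is guaranteed to catch it. A tie counts as unsafe.
 */
static int scs_overflows_first(size_t stack_bytes, size_t scs_bytes,
			       size_t frame_bytes)
{
	return max_depth(scs_bytes, SCS_ENTRY_BYTES) <=
	       max_depth(stack_bytes, frame_bytes);
}

/*
 * Smallest shadow stack size (in bytes) for which the ordinary stack is
 * always exhausted first, given the smallest possible frame size.
 */
static size_t min_safe_scs_bytes(size_t stack_bytes, size_t min_frame_bytes)
{
	return (max_depth(stack_bytes, min_frame_bytes) + 1) * SCS_ENTRY_BYTES;
}
```

Under these assumptions, scs_overflows_first(STACK_BYTES, SCS_BYTES,
MIN_FRAME_BYTES) returns 1: the shadow stack holds 128 entries while the
ordinary stack holds 1024 minimal frames. Only with frames averaging well
above 128 bytes does the guard region win. Real frames are usually larger
than the minimum, so this is a worst case rather than a prediction.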