On Thu, Jul 7, 2016 at 4:01 AM, Arnd Bergmann <arnd@xxxxxxxx> wrote: > On Wednesday, July 6, 2016 3:25:20 PM CEST Kees Cook wrote: >> This is the start of porting PAX_USERCOPY into the mainline kernel. This >> is the first set of features, controlled by CONFIG_HARDENED_USERCOPY. The >> work is based on code by PaX Team and Brad Spengler, and an earlier port >> from Casey Schaufler. Additional non-slab page tests are from Rik van Riel. >> >> This patch contains the logic for validating several conditions when >> performing copy_to_user() and copy_from_user() on the kernel object >> being copied to/from: >> - address range doesn't wrap around >> - address range isn't NULL or zero-allocated (with a non-zero copy size) >> - if on the slab allocator: >> - object size must be less than or equal to copy size (when check is >> implemented in the allocator, which appear in subsequent patches) >> - otherwise, object must not span page allocations >> - if on the stack >> - object must not extend before/after the current process task >> - object must be contained by the current stack frame (when there is >> arch/build support for identifying stack frames) >> - object must not overlap with kernel text >> >> Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx> > > Nice! > > I have a few further thoughts, most of which have probably been > considered before: > >> +static inline const char *check_bogus_address(const void *ptr, unsigned long n) >> +{ >> + /* Reject if object wraps past end of memory. */ >> + if (ptr + n < ptr) >> + return "<wrapped address>"; >> + >> + /* Reject if NULL or ZERO-allocation. */ >> + if (ZERO_OR_NULL_PTR(ptr)) >> + return "<null>"; >> + >> + return NULL; >> +} > > This checks against address (void*)16, but I guess on most architectures the > lowest possible kernel address is much higher. While there may not be much > that to exploit if the expected kernel address points to userland, forbidding > any obviously incorrect address that is outside of the kernel may be easier. > > Even on architectures like s390 that start the kernel memory at (void *)0x0, > the lowest address to which we may want to do a copy_to_user would be much > higher than (void*)0x16. Yeah, that's worth exploring, but given the shenanigans around set_fs(), I'd like to leave this as-is, and we can add to these checks as we remove as much of the insane usage of set_fs(). >> + >> + /* Allow kernel rodata region (if not marked as Reserved). */ >> + if (ptr >= (const void *)__start_rodata && >> + end <= (const void *)__end_rodata) >> + return NULL; > > Should we explicitly forbid writing to rodata, or is it enough to > rely on page protection here? Hm, interesting. That's a very small check to add. My knee-jerk is to just leave it up to page protection. I'm on the fence. :) > >> + /* Allow kernel bss region (if not marked as Reserved). */ >> + if (ptr >= (const void *)__bss_start && >> + end <= (const void *)__bss_stop) >> + return NULL; > > accesses to .data/.rodata/.bss are probably not performance critical, > so we could go further here and check the kallsyms table to ensure > that we are not spanning multiple symbols here. Oh, interesting! Yeah, would you be willing to put together that patch and test it? I wonder if there are any cases where there are legitimate usercopys across multiple symbols. > For stuff that is performance critical, should there be a way to > opt out of the checks, or do we assume it already uses functions > that avoid the checks? I looked at the file and network I/O path > briefly and they seem to use kmap_atomic() to get to the user pages > at least in some of the common cases (but I may well be missing > important ones). I don't want to start with an exemption here, so until such a case is found, I'd rather leave this as-is. That said, the primary protection here tends to be buggy lengths (which is why put/get_user() is untouched). For constant-sized copies, some checks could be skipped. In the second part of this protection (what I named CONFIG_HARDENED_USERCOPY_WHITELIST in the RFC version of this series), there are cases where we want to skip the whitelist checking since it is for a constant-sized copy the code understands is okay to pull out of an otherwise disallowed allocator object. -Kees -- Kees Cook Chrome OS & Brillo Security -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>