On Wednesday, July 6, 2016 3:25:20 PM CEST Kees Cook wrote: > This is the start of porting PAX_USERCOPY into the mainline kernel. This > is the first set of features, controlled by CONFIG_HARDENED_USERCOPY. The > work is based on code by PaX Team and Brad Spengler, and an earlier port > from Casey Schaufler. Additional non-slab page tests are from Rik van Riel. > > This patch contains the logic for validating several conditions when > performing copy_to_user() and copy_from_user() on the kernel object > being copied to/from: > - address range doesn't wrap around > - address range isn't NULL or zero-allocated (with a non-zero copy size) > - if on the slab allocator: > - object size must be less than or equal to copy size (when check is > implemented in the allocator, which appear in subsequent patches) > - otherwise, object must not span page allocations > - if on the stack > - object must not extend before/after the current process task > - object must be contained by the current stack frame (when there is > arch/build support for identifying stack frames) > - object must not overlap with kernel text > > Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx> Nice! I have a few further thoughts, most of which have probably been considered before: > +static inline const char *check_bogus_address(const void *ptr, unsigned long n) > +{ > + /* Reject if object wraps past end of memory. */ > + if (ptr + n < ptr) > + return "<wrapped address>"; > + > + /* Reject if NULL or ZERO-allocation. */ > + if (ZERO_OR_NULL_PTR(ptr)) > + return "<null>"; > + > + return NULL; > +} This checks against address (void*)16, but I guess on most architectures the lowest possible kernel address is much higher. While there may not be much that to exploit if the expected kernel address points to userland, forbidding any obviously incorrect address that is outside of the kernel may be easier. Even on architectures like s390 that start the kernel memory at (void *)0x0, the lowest address to which we may want to do a copy_to_user would be much higher than (void*)0x16. > + > + /* Allow kernel rodata region (if not marked as Reserved). */ > + if (ptr >= (const void *)__start_rodata && > + end <= (const void *)__end_rodata) > + return NULL; Should we explicitly forbid writing to rodata, or is it enough to rely on page protection here? > + /* Allow kernel bss region (if not marked as Reserved). */ > + if (ptr >= (const void *)__bss_start && > + end <= (const void *)__bss_stop) > + return NULL; accesses to .data/.rodata/.bss are probably not performance critical, so we could go further here and check the kallsyms table to ensure that we are not spanning multiple symbols here. For stuff that is performance critical, should there be a way to opt out of the checks, or do we assume it already uses functions that avoid the checks? I looked at the file and network I/O path briefly and they seem to use kmap_atomic() to get to the user pages at least in some of the common cases (but I may well be missing important ones). Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html