Re: [PATCHv2 1/1] Documentation: describe how to add a system call

Kees Cook <keescook@xxxxxxxxxxxx> · Fri, 31 Jul 2015 11:56:06 -0700

On Thu, Jul 30, 2015 at 6:02 PM, Josh Triplett <josh@xxxxxxxxxxxxxxxx> wrote:
> On Thu, Jul 30, 2015 at 01:03:43PM -0700, Kees Cook wrote:
>> On Thu, Jul 30, 2015 at 12:04 PM, Josh Triplett <josh@xxxxxxxxxxxxxxxx> wrote:
>> > On Thu, Jul 30, 2015 at 11:21:54AM -0700, Kees Cook wrote:
>> >> I like this, it's a good description of both options. I'm still biased
>> >> about the approach: I prefer flags, since pointers to user structures
>> >> complicate syscall filtering. ;)
>> >
>> > Seems like we should do two things to make that easier:
>> >
>> > 1) Create a standardized kernel mechanism for parameter-struct handling,
>> >    implementing the recommendations mentioned here.
>>
>> It's been suggested in the past that nlmsg is appropriate for such a
>> thing, but I remain suspicious. :)
>
> Likewise. :)
>
>> > 2) Integrate into that mechanism a way to filter the resulting parameter
>> >    struct with BPF *after* it has been copied to kernel space (and thus
>> >    can no longer be tampered with).
>>
>> Yeah, this is an irritating part: the structures operated on are copied
>> from userspace ad hoc in each syscall. Doing argument checking would
>> mean double copies initially, and perhaps teaching syscalls about
>> optional "already copied" arguments or something as an optimization.
>
> No, double copies can't work for security reasons.  Because otherwise
> you could race the kernel from another thread, substituting different
> values after the check and before the use.

Right, the double copy method would require setting up a per-thread
userspace memory mapping that was read-only from userspace but
writable from kernel space.
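
To make the race concrete, here is a small userspace model of it. This is
only a sketch, not kernel code: memcpy() stands in for copy_from_user(), a
callback stands in for the racing thread, and every name in it
(demo_params, DEMO_ALLOWED_FLAGS, double_fetch, single_fetch) is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parameter struct; stands in for any user-supplied arguments. */
struct demo_params {
        unsigned int flags;
};

#define DEMO_ALLOWED_FLAGS 0x1u

/* Models a second thread rewriting "user" memory between two fetches. */
typedef void (*racer_fn)(struct demo_params *user);

static void flip_flags(struct demo_params *user)
{
        user->flags = 0x80u;    /* a value the filter would have rejected */
}

/*
 * Double fetch: validate one read of user memory, then read it again for use.
 * Returns the flags actually acted on, or -1 if the filter rejected them.
 */
static long double_fetch(struct demo_params *user, racer_fn racer)
{
        struct demo_params check, use;

        memcpy(&check, user, sizeof(check));    /* fetch #1: filter's copy */
        if (check.flags & ~DEMO_ALLOWED_FLAGS)
                return -1;
        if (racer)
                racer(user);                    /* window between check and use */
        memcpy(&use, user, sizeof(use));        /* fetch #2: syscall's copy */
        return (long)use.flags;
}

/* Single fetch: snapshot once, then validate and use that same snapshot. */
static long single_fetch(struct demo_params *user, racer_fn racer)
{
        struct demo_params snap;

        memcpy(&snap, user, sizeof(snap));
        if (racer)
                racer(user);                    /* too late: snapshot is private */
        if (snap.flags & ~DEMO_ALLOWED_FLAGS)
                return -1;
        return (long)snap.flags;
}
```

With flags starting at 0x1, double_fetch() with the racer ends up acting on
0x80 even though the check passed, while single_fetch() still acts on 0x1.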

> I think the right API looks *roughly* like this:
>
> int _copy_param_struct(size_t kernel_len, void *kernel_struct, size_t user_len, void __user *user_struct)
> {
>         if (user_len > kernel_len)
>                 return -EINVAL;
>         if (user_len && copy_from_user(kernel_struct, user_struct, user_len))
>                 return -EFAULT;
>         if (user_len < kernel_len)
>                 memset((char *)kernel_struct + user_len, 0, kernel_len - user_len);
>         return 0;
> }
>
> #define copy_param_struct(kernel_struct, user_len, user_struct) _copy_param_struct( \
>                 sizeof(*(kernel_struct)) + \
>                 BUILD_BUG_ON_ZERO(!__same_type(*(kernel_struct), *(user_struct))), \
>                 (kernel_struct), (user_len), (user_struct))
>
>
> Then the syscall looks like this:
>
> SYSCALL_DEFINEn(xyzzy, ..., ..., size_t user_params_len, struct xyzzy_params __user *user_params)
> {
>         int ret;
>         struct xyzzy_params params;
>
>         ret = copy_param_struct(&params, user_params_len, user_params);
>         if (ret)
>                 return ret;
>         ...
>
>
> And you could then hook copy_param_struct to add arbitrary additional
> syscall parameter validation.  Bonus if there's some way to make the
> copy and validation occur before the syscall is ever invoked, rather
> than inside the syscall, but that would require adding fancier syscall
> definition mechanisms that autogenerate such code.
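
For anyone wanting to poke at the size-handling rules in the quoted helper
outside the kernel, a userspace model might look roughly like the
following. It is a sketch under assumptions: memcpy() replaces
copy_from_user() (so the -EFAULT path is not modelled), and struct
model_params with its two fields is invented purely for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: a later revision of an interface that began with `flags`. */
struct model_params {
        unsigned int flags;
        unsigned int timeout_ms;        /* field added in a later revision */
};

/*
 * Userspace model of the quoted _copy_param_struct(). memcpy() stands in
 * for copy_from_user(), so a faulting user pointer is not modelled.
 */
static int model_copy_param_struct(size_t kernel_len, void *kernel_struct,
                                   size_t user_len, const void *user_struct)
{
        if (user_len > kernel_len)
                return -EINVAL;         /* caller is newer than this "kernel" */
        if (user_len)
                memcpy(kernel_struct, user_struct, user_len);
        if (user_len < kernel_len)      /* older caller: zero the unknown tail */
                memset((char *)kernel_struct + user_len, 0,
                       kernel_len - user_len);
        return 0;
}
```

An older caller that passes only the leading `flags` field gets the newer
field zero-filled, and a caller passing a larger struct than the model
knows about is rejected with -EINVAL.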

The trouble is that the syscall hooks (both seccomp and ptrace) run
before the sys_* function executes. So the parameter extraction inside
the syscall suddenly becomes conditional: did ptrace/seccomp already
extract the args? If so, use that copy; otherwise copy them in myself
now that I need them, etc.

It's entirely doable, but it's going to require some careful design.
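
As a very rough userspace sketch of the "use the copy if someone already
made it" idea (nothing here is an existing kernel interface; param_cache,
get_params, model_filter and model_syscall are all made-up names, and
memcpy() again stands in for copy_from_user()):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parameter struct. */
struct model_params {
        unsigned int flags;
};

/* Hypothetical per-syscall state shared by the hook and the handler. */
struct param_cache {
        bool extracted;                 /* has anyone copied the args yet? */
        struct model_params params;     /* the one trusted snapshot */
        unsigned int fetches;           /* counts reads of "user" memory */
};

/* Copy from "user" memory at most once; later callers reuse the snapshot. */
static const struct model_params *get_params(struct param_cache *cache,
                                             const struct model_params *user)
{
        if (!cache->extracted) {
                memcpy(&cache->params, user, sizeof(cache->params));
                cache->extracted = true;
                cache->fetches++;
        }
        return &cache->params;
}

/* Models a seccomp/ptrace-style hook that runs before the handler. */
static bool model_filter(struct param_cache *cache,
                         const struct model_params *user)
{
        return (get_params(cache, user)->flags & ~0x1u) == 0;
}

/* Models the sys_* body, which may or may not run after a hook. */
static unsigned int model_syscall(struct param_cache *cache,
                                  const struct model_params *user)
{
        return get_params(cache, user)->flags;
}
```

If the filter runs first, the handler sees exactly what the filter checked,
even if "user" memory changes in between; if no filter ran, the handler
does the single copy itself.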

-Kees

-- 
Kees Cook
Chrome OS Security