Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access kernel memory that can fault

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 22 Feb 2019 09:43:14 -0800
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Fri, Feb 22, 2019 at 12:35 AM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> >
> > Or, can we do this?
> >
> > long __probe_user_read(void *dst, const void *src, size_t size)
> > {
> 
> Add a
> 
>         if (!access_ok(src, size))
>                 ret = -EFAULT;
>         else {
> .. do the pagefault_disable() etc ..
>         }

Since kprobes handler runs in IRQ context, we can not use access_ok() in it.
(only on x86 + CONFIG_DEBUG_ATOMIC_SLEEP=y)

In arch/x86/include/asm/uaccess.h:

#define access_ok(addr, size)                                   \
({                                                                      \
        WARN_ON_IN_IRQ();                                               \
        likely(!__range_not_ok(addr, size, user_addr_max()));           \
})

Do we need acccess_ok_inatomic()?

BTW, it seems a bit strange that this WARN_ON_IN_IRQ() is only in x86
access_ok() implementation, since CONFIG_DEBUG_ATOMIC_SLEEP(which defines
WARN_ON_IN_IRQ) doesn't depend on x86, and access_ok() is widely used
in kernel. I think it would be better that each arch provides __access_ok()
and include/linux/uaccess.h provides access_ok() with WARN_ON_IN_IRQ().


> to after the "set_fs()", and it looks good to me. Make it clear that
> yes, this works _only_ for user reads.
> 
> Adn that makes all the games with "kernel_uaccess_faults_ok"
> pointless, so you can just remove them.

OK.

> 
> (note that the "access_ok()" has to come after we've done "set_fs()",
> because it takes the address limit from that).
> 
> Also, since normally we'd expect that we already have USER_DS, it
> might be worthwhile to do this with a wrapper, something along the
> lines of
> 
>         mm_segment_t old_fs = get_fs();
> 
>         if (segment_eq(old_fs, USER_DS))
>               return __normal_probe_user_read();
>         set_fs(USER_DS);
>         ret = __normal_probe_user_read();
>         set_fs(old_fs);
>         return ret;
> 
> and have that __normal_probe_user_read() just do
> 
>         if (!access_ok(src, size))
>                 return -EFAULT;
>         pagefault_disable();
>         ret = __copy_from_user_inatomic(dst, ...);
>         pagefault_enable();
>         return ret ? -EFAULT : 0;
> 
> which looks more obvious.

OK.

> 
> Also, I would suggest that you just make the argument type be "const
> void __user *", since the whole point is that this takes a user
> pointer, and nothing else.

Ah, right.

> Then we should still probably fix up "__probe_kernel_read()" to not
> allow user accesses. The easiest way to do that is actually likely to
> use the "unsafe_get_user()" functions *without* doing a
> uaccess_begin(), which will mean that modern CPU's will simply fault
> on a kernel access to user space.

Or, use __chk_user_ptr(ptr) to check it?

Thank you,

> 
> The nice thing about that is that usually developers will have access
> to exactly those modern boxes, so the people who notice that it
> doesn't work are the right people.
> 
> Alternatively, we should just make it be architecture-specific, so
> that architectures can decide "this address cannot be a kernel
> address" and refuse to do it.
> 
>                 Linus


-- 
Masami Hiramatsu <mhiramat@xxxxxxxxxx>



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux