Re: [PATCH] proc: restrict kernel stack dumps to root

Kees Cook <keescook@xxxxxxxxxxxx> · Wed, 12 Sep 2018 15:27:53 -0700

On Wed, Sep 12, 2018 at 8:29 AM, Jann Horn <jannh@xxxxxxxxxx> wrote:
> +linux-api, I guess
>
> On Tue, Sep 11, 2018 at 8:39 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
>>
>> Restrict the ability to inspect kernel stacks of arbitrary tasks to root
>> in order to prevent a local attacker from exploiting racy stack unwinding
>> to leak kernel task stack contents.
>> See the added comment for a longer rationale.
>>
>> There don't seem to be any users of this userspace API that can't
>> gracefully bail out if reading from the file fails. Therefore, I believe
>> that this change is unlikely to break things.
>> In the case that this patch does end up needing a revert, the next-best
>> solution might be to fake a single-entry stack based on wchan.
>>
>> Fixes: 2ec220e27f50 ("proc: add /proc/*/stack")
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
>> ---
>>  fs/proc/base.c | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index ccf86f16d9f0..7e9f07bf260d 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -407,6 +407,20 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
>>         unsigned long *entries;
>>         int err;
>>
>> +       /*
>> +        * The ability to racily run the kernel stack unwinder on a running task
>> +        * and then observe the unwinder output is scary; while it is useful for
>> +        * debugging kernel issues, it can also allow an attacker to leak kernel
>> +        * stack contents.
>> +        * Doing this in a manner that is at least safe from races would require
>> +        * some work to ensure that the remote task can not be scheduled; and
>> +        * even then, this would still expose the unwinder as local attack
>> +        * surface.
>> +        * Therefore, this interface is restricted to root.
>> +        */
>> +       if (!file_ns_capable(m->file, &init_user_ns, CAP_SYS_ADMIN))
>> +               return -EACCES;

In the past, we've avoided hard errors like this in favor of just
censoring the output. Do we want to be more cautious about userspace
breakage here and do the same? (i.e. either a bare "return 0;" or the
fuller seq_printf(m, "[<0>] privileged\n"); return 0;)
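
To make the difference visible from the userspace side, here is a rough
sketch of how a reader of /proc/<pid>/stack might tell the variants
apart. It is illustrative only: the enum, both helper names and the
buffer handling are made up for this mail, and the "[<0>] privileged"
placeholder is just the string floated above, not something any kernel
is known to print.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical placeholder line, taken from the suggestion above. */
#define STACK_PLACEHOLDER "[<0>] privileged\n"

enum stack_result {
	STACK_TRACE,	/* read succeeded and returned frames */
	STACK_CENSORED,	/* read succeeded, but empty or placeholder only */
	STACK_DENIED,	/* open() or read() failed with EACCES/EPERM */
	STACK_ERROR,	/* any other failure */
};

/*
 * Classify text that was already read from the file. An empty buffer
 * corresponds to the bare "return 0" variant, the exact placeholder
 * line to the seq_printf() variant.
 */
static enum stack_result classify_stack_text(const char *buf, size_t len)
{
	size_t plen = strlen(STACK_PLACEHOLDER);

	if (len == 0)
		return STACK_CENSORED;
	if (len == plen && memcmp(buf, STACK_PLACEHOLDER, plen) == 0)
		return STACK_CENSORED;
	return STACK_TRACE;
}

/*
 * Read the first chunk of @path into @buf and classify it. A check in
 * the show handler, as in the patch, surfaces at read() time rather
 * than at open() time, so both calls are checked.
 */
static enum stack_result read_task_stack(const char *path, char *buf,
					 size_t cap, size_t *len)
{
	ssize_t n;
	int saved_errno;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return (errno == EACCES || errno == EPERM) ?
			STACK_DENIED : STACK_ERROR;

	n = read(fd, buf, cap);
	saved_errno = errno;	/* close() may clobber errno */
	close(fd);

	if (n < 0)
		return (saved_errno == EACCES || saved_errno == EPERM) ?
			STACK_DENIED : STACK_ERROR;

	*len = (size_t)n;
	return classify_stack_text(buf, *len);
}
```

With the patch as posted, an unprivileged caller would land in
STACK_DENIED; with either censoring variant it would land in
STACK_CENSORED, and tools that only check for a successful read would
keep working.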

>> +
>>         entries = kmalloc_array(MAX_STACK_TRACE_DEPTH, sizeof(*entries),
>>                                 GFP_KERNEL);
>>         if (!entries)
>> --
>> 2.19.0.rc2.392.g5ba43deb5a-goog
>>

-Kees

-- 
Kees Cook
Pixel Security