Re: [REVIEW PATCH v2] ptrace: reintroduce usage of subjective credentials in ptrace_has_cap()

Kees Cook <keescook@xxxxxxxxxxxx> · Thu, 16 Jan 2020 18:29:26 -0800

On Thu, Jan 16, 2020 at 11:45:18PM +0100, Christian Brauner wrote:
> Commit 69f594a38967 ("ptrace: do not audit capability check when outputing /proc/pid/stat")
> introduced the ability to opt out of audit messages for accesses to
> various proc files since they are not violations of policy.
> While doing so it somehow switched the check from ns_capable() to
> has_ns_capability{_noaudit}(). That means it switched from checking the
> subjective credentials of the task to using the objective credentials. I
> couldn't find the original lkml thread and so I don't know why this switch
> was done. But it seems wrong since ptrace_has_cap() is currently only used
> in ptrace_may_access(). And it's used to check whether the calling task
> (subject) has the CAP_SYS_PTRACE capability in the provided user namespace
> to operate on the target task (object). According to the cred.h comments
> this would mean the subjective credentials of the calling task need to be
> used.

I don't follow this description. As far as I can see, both the current
code and your patch end up using current's cred, yes? I'm not following
the subjective/objective change mentioned here.

Before:
bool has_ns_capability(struct task_struct *t,
                       struct user_namespace *ns, int cap)
{
        int ret;

        rcu_read_lock();
        ret = security_capable(__task_cred(t), ns, cap, CAP_OPT_NONE);
        rcu_read_unlock();

        return (ret == 0);
}
...
		return has_ns_capability(current, ns, CAP_SYS_PTRACE)

After:
	const struct cred *cred = current_cred(), ...
...
		return security_capable(cred, ns, CAP_SYS_PTRACE, CAP_OPT_NOAUDIT);

The cred passed to security_capable() is the subject before and after.

> This switches it to use security_capable() because we only call
> ptrace_has_cap() in ptrace_may_access() and in there we already have a
> stable reference to the calling tasks creds under cred_guard_mutex so
> there's no need to go through another series of dereferences and rcu
> locking done in ns_capable{_noaudit}().

This makes sense to me -- now there's no possible race on the cred
changing between the two ptrace_has_cap() checks, yes?

However, I'm still trying to see where cred_guard_mutex() comes into
play for callers of ptrace_may_access(). I see it for the object
("task" arg in ptrace_may_access()), but if this is dealing with the cred
on current, it's just the RCU read lock protecting it (which I think is
fine here), but seems confusing in the commit log.

> As one example where this might be particularly problematic, Jann pointed
> out that in combination with the upcoming IORING_OP_OPENAT feature, this
> bug might allow unprivileged users to bypass the capability checks while
> asynchronously opening files like /proc/*/mem, because the capability
> checks for this would be performed against kernel credentials.

As in, winning a race between the two ptrace_has_cap() calls across a
cred transition?

> Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
> Cc: Eric Paris <eparis@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Reviewed-by: Serge Hallyn <serge@xxxxxxxxxx>
> Reviewed-by: Jann Horn <jannh@xxxxxxxxxx>
> Fixes: 69f594a38967 ("ptrace: do not audit capability check when outputing /proc/pid/stat")
> Signed-off-by: Christian Brauner <christian.brauner@xxxxxxxxxx>
> ---
>  kernel/ptrace.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index cb9ddcc08119..d146133e97f1 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -264,12 +264,13 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
>  	return ret;
>  }
>  
> -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> +static int ptrace_has_cap(const struct cred *cred, struct user_namespace *ns,
> +			  unsigned int mode)
>  {
>  	if (mode & PTRACE_MODE_NOAUDIT)
> -		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> +		return security_capable(cred, ns, CAP_SYS_PTRACE, CAP_OPT_NOAUDIT);
>  	else
> -		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> +		return security_capable(cred, ns, CAP_SYS_PTRACE, CAP_OPT_NONE);
>  }

Style nit -- can we just make this a single invocation of
security_capable(), something like:

	return security_capable(cred, ns, CAP_SYS_PTRACE,
				mode & PTRACE_MODE_NOAUDIT
					?  CAP_OPT_NOAUDIT,
					: CAP_OPT_NONE) == 0;

Obviously not required, but the longer if hurts my eyes. ;)

>  
>  /* Returns 0 on success, -errno on denial. */
> @@ -321,7 +322,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
>  	    gid_eq(caller_gid, tcred->sgid) &&
>  	    gid_eq(caller_gid, tcred->gid))
>  		goto ok;
> -	if (ptrace_has_cap(tcred->user_ns, mode))
> +	if (ptrace_has_cap(cred, tcred->user_ns, mode))
>  		goto ok;
>  	rcu_read_unlock();
>  	return -EPERM;
> @@ -340,7 +341,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
>  	mm = task->mm;
>  	if (mm &&
>  	    ((get_dumpable(mm) != SUID_DUMP_USER) &&
> -	     !ptrace_has_cap(mm->user_ns, mode)))
> +	     !ptrace_has_cap(cred, mm->user_ns, mode)))
>  	    return -EPERM;
>  
>  	return security_ptrace_access_check(task, mode);
> 
> base-commit: b3a987b0264d3ddbb24293ebff10eddfc472f653
> -- 
> 2.25.0
> 

So, I think this change looks correct, but I find the commit subject
and log confusing (perhaps because I am dense) and misleading (again,
perhaps because I am dense).

-- 
Kees Cook