On Thu, Jan 16, 2020 at 06:29:26PM -0800, Kees Cook wrote:
> On Thu, Jan 16, 2020 at 11:45:18PM +0100, Christian Brauner wrote:
> > As one example where this might be particularly problematic, Jann pointed
> > out that in combination with the upcoming IORING_OP_OPENAT feature, this
> > bug might allow unprivileged users to bypass the capability checks while
> > asynchronously opening files like /proc/*/mem, because the capability
> > checks for this would be performed against kernel credentials.

To follow up on this part of your mail. No, afaict, it's not about
winning a race. It's way simpler...

When io_uring creates a new kernel context it records the subjective
credentials of the caller:

        ctx = io_ring_ctx_alloc(p);
        if (!ctx) {
                if (account_mem)
                        io_unaccount_mem(user, ring_pages(p->sq_entries,
                                                          p->cq_entries));
                free_uid(user);
                return -ENOMEM;
        }
        ctx->compat = in_compat_syscall();
        ctx->account_mem = account_mem;
        ctx->user = user;
------> ctx->creds = get_current_cred(); <------

Later on, when it starts to do work it creates a kernel thread:

                        ctx->sqo_thread = kthread_create_on_cpu(io_sq_thread,
                                                        ctx, cpu,
                                                        "io_uring-sq");
                } else {
                        ctx->sqo_thread = kthread_create(io_sq_thread, ctx,
                                                        "io_uring-sq");
                }

and registers io_sq_thread as "callback". The callback io_sq_thread()
runs __with kernel creds__. To prevent this from becoming an issue,
io_sq_thread() will override the __subjective credentials__ with the
caller's credentials:

        old_cred = override_creds(ctx->creds);

But ptrace_has_cap() currently looks at __task_cred(current) aka
__real_cred__, which override_creds() does not touch. This means once
IORING_OP_OPENAT and IORING_OP_OPENAT2 land in v5.5-rc6 it is more or
less trivial for an unprivileged user to bypass ptrace_may_access().

Christian
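
P.S. To make the mismatch above concrete, here is a minimal userspace
sketch. It is not kernel code: struct model_cred, struct model_task and
every model_* helper are hypothetical stand-ins invented for illustration,
and the single boolean only stands in for a capability such as
CAP_SYS_PTRACE. It assumes nothing beyond what is described above, namely
that the override replaces the subjective credentials only, so a check
that reads real_cred still sees the kernel thread's credentials.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cred: one capability bit only. */
struct model_cred {
        bool cap_sys_ptrace;
};

/* Hypothetical stand-in for a task: objective and subjective creds. */
struct model_task {
        const struct model_cred *real_cred;     /* objective credentials */
        const struct model_cred *cred;          /* subjective credentials */
};

/* Models the override step: only the subjective pointer is swapped. */
static const struct model_cred *
model_override_creds(struct model_task *t, const struct model_cred *new_cred)
{
        const struct model_cred *old;

        assert(t != NULL && new_cred != NULL);
        old = t->cred;
        t->cred = new_cred;
        return old;
}

/* Models a thread started with kernel creds that then takes on the
 * credentials recorded from the caller when the context was created. */
static struct model_task
model_sq_thread(const struct model_cred *kernel_cred,
                const struct model_cred *caller_cred)
{
        struct model_task t = { .real_cred = kernel_cred, .cred = kernel_cred };

        model_override_creds(&t, caller_cred);
        return t;
}

/* A check against real_cred: unaffected by the override. */
static bool model_check_real(const struct model_task *t)
{
        return t->real_cred->cap_sys_ptrace;
}

/* A check against the subjective creds: sees the caller's credentials. */
static bool model_check_subjective(const struct model_task *t)
{
        return t->cred->cap_sys_ptrace;
}
```

With a privileged kernel_cred and an unprivileged caller_cred,
model_check_real() still reports the capability as present while
model_check_subjective() does not, which is the gap described above.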