I added a CC: linux-security-module@vger

Hi, Keith,

Keith Busch <kbusch@xxxxxxxx> writes:

> From: Keith Busch <kbusch@xxxxxxxxxx>
>
> The uring_cmd operation is often used for privileged actions, so drivers
> subscribing to this interface check capable() for each command. The
> capable() function is not fast path friendly for many kernel configs,
> and this can really harm performance. Stash the capable sys admin
> attribute in the io_uring context and set a new issue_flag for the
> uring_cmd interface.

I have a few questions. What privileged actions are performance
sensitive? I would hope that anything requiring privileges would not be
in a fast path (but clearly that's not the case). What performance
benefits did you measure with this patch set in place (and on what
workloads)? What happens when a ring fd is passed to another process?

Finally, as Jens mentioned, I would expect dropping privileges to, you
know, drop privileges. I don't think a commit message is going to be
enough documentation for a change like this.

Cheers,
Jeff

>
> Signed-off-by: Keith Busch <kbusch@xxxxxxxxxx>
> ---
>  include/linux/io_uring_types.h | 4 ++++
>  io_uring/io_uring.c            | 1 +
>  io_uring/uring_cmd.c           | 2 ++
>  3 files changed, 7 insertions(+)
>
> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
> index bebab36abce89..d64d6916753f0 100644
> --- a/include/linux/io_uring_types.h
> +++ b/include/linux/io_uring_types.h
> @@ -36,6 +36,9 @@ enum io_uring_cmd_flags {
>          /* set when uring wants to cancel a previously issued command */
>          IO_URING_F_CANCEL = (1 << 11),
>          IO_URING_F_COMPAT = (1 << 12),
> +
> +        /* ring validated as CAP_SYS_ADMIN capable */
> +        IO_URING_F_SYS_ADMIN = (1 << 13),
>  };
>
>  struct io_wq_work_node {
> @@ -240,6 +243,7 @@ struct io_ring_ctx {
>          unsigned int poll_activated: 1;
>          unsigned int drain_disabled: 1;
>          unsigned int compat: 1;
> +        unsigned int sys_admin: 1;
>
>          struct task_struct *submitter_task;
>          struct io_rings *rings;
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 1d254f2c997de..4aa10b64f539e 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -3980,6 +3980,7 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
>                  ctx->syscall_iopoll = 1;
>
>          ctx->compat = in_compat_syscall();
> +        ctx->sys_admin = capable(CAP_SYS_ADMIN);
>          if (!ns_capable_noaudit(&init_user_ns, CAP_IPC_LOCK))
>                  ctx->user = get_uid(current_user());
>
> diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
> index 8a38b9f75d841..764f0e004aa00 100644
> --- a/io_uring/uring_cmd.c
> +++ b/io_uring/uring_cmd.c
> @@ -164,6 +164,8 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
>                  issue_flags |= IO_URING_F_CQE32;
>          if (ctx->compat)
>                  issue_flags |= IO_URING_F_COMPAT;
> +        if (ctx->sys_admin)
> +                issue_flags |= IO_URING_F_SYS_ADMIN;
>          if (ctx->flags & IORING_SETUP_IOPOLL) {
>                  if (!file->f_op->uring_cmd_iopoll)
>                          return -EOPNOTSUPP;
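
To make the fd-passing and privilege-dropping questions concrete, here is a
small user-space C model of the semantics as I read the patch. It is not
kernel code. Every model_* name is hypothetical and exists only for this
illustration; the only things taken from the patch are the idea of a
sys_admin bit snapshotted at ring creation and the (1 << 13) flag value. It
contrasts a per-command check (what drivers do today with capable()) with
the cached bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; these are NOT kernel structures. */
struct model_task {
	bool cap_sys_admin;	/* stands in for the task's current CAP_SYS_ADMIN */
};

struct model_ring {
	bool sys_admin;		/* snapshot at creation, modeled on ctx->sys_admin */
};

/* Same bit position as the flag proposed in the patch. */
#define MODEL_F_SYS_ADMIN (1u << 13)

/* Ring creation: capture the creator's capability once. */
static struct model_ring model_ring_create(const struct model_task *creator)
{
	struct model_ring ring = { .sys_admin = creator->cap_sys_admin };
	return ring;
}

/* Per-command check against whichever task is submitting right now. */
static bool model_check_live(const struct model_task *submitter)
{
	return submitter->cap_sys_admin;
}

/* Issue path: derive flags from the ring, ignoring the submitter. */
static unsigned int model_issue_flags(const struct model_ring *ring)
{
	unsigned int flags = 0;

	if (ring->sys_admin)
		flags |= MODEL_F_SYS_ADMIN;
	return flags;
}

/* What a driver would see if it trusted only the cached flag. */
static bool model_check_cached(const struct model_ring *ring)
{
	return (model_issue_flags(ring) & MODEL_F_SYS_ADMIN) != 0;
}

/* Walks through both scenarios raised above; returns 0 if the model holds. */
static int model_demo(void)
{
	struct model_task creator = { .cap_sys_admin = true };
	struct model_task other = { .cap_sys_admin = false };
	struct model_ring ring = model_ring_create(&creator);

	/* Scenario 1: the creator drops privileges after setup. */
	creator.cap_sys_admin = false;
	assert(!model_check_live(&creator));	/* live check now refuses */
	assert(model_check_cached(&ring));	/* cached bit still allows */

	/* Scenario 2: the ring is handed to an unprivileged task. */
	assert(!model_check_live(&other));
	assert(model_check_cached(&ring));
	return 0;
}
```

In this model the cached bit outlives both the privilege drop and the
hand-off to another task. Whether the real patch behaves this way depends on
how drivers consume the flag, which this excerpt does not show, so treat the
above as a statement of the question rather than an answer to it.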