On 9/8/20 8:10 AM, Stefano Garzarella wrote: > On Tue, Sep 08, 2020 at 07:57:08AM -0600, Jens Axboe wrote: >> On 9/8/20 7:44 AM, Stefano Garzarella wrote: >>> Hi Jens, >>> >>> On Thu, Aug 27, 2020 at 04:58:31PM +0200, Stefano Garzarella wrote: >>>> This patch adds a new IORING_SETUP_R_DISABLED flag to start the >>>> rings disabled, allowing the user to register restrictions, >>>> buffers, files, before to start processing SQEs. >>>> >>>> When IORING_SETUP_R_DISABLED is set, SQE are not processed and >>>> SQPOLL kthread is not started. >>>> >>>> The restrictions registration are allowed only when the rings >>>> are disable to prevent concurrency issue while processing SQEs. >>>> >>>> The rings can be enabled using IORING_REGISTER_ENABLE_RINGS >>>> opcode with io_uring_register(2). >>>> >>>> Suggested-by: Jens Axboe <axboe@xxxxxxxxx> >>>> Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx> >>>> Signed-off-by: Stefano Garzarella <sgarzare@xxxxxxxxxx> >>>> --- >>>> v4: >>>> - fixed io_uring_enter() exit path when ring is disabled >>>> >>>> v3: >>>> - enabled restrictions only when the rings start >>>> >>>> RFC v2: >>>> - removed return value of io_sq_offload_start() >>>> --- >>>> fs/io_uring.c | 52 ++++++++++++++++++++++++++++++----- >>>> include/uapi/linux/io_uring.h | 2 ++ >>>> 2 files changed, 47 insertions(+), 7 deletions(-) >>>> >>>> diff --git a/fs/io_uring.c b/fs/io_uring.c >>>> index 5f62997c147b..b036f3373fbe 100644 >>>> --- a/fs/io_uring.c >>>> +++ b/fs/io_uring.c >>>> @@ -226,6 +226,7 @@ struct io_restriction { >>>> DECLARE_BITMAP(sqe_op, IORING_OP_LAST); >>>> u8 sqe_flags_allowed; >>>> u8 sqe_flags_required; >>>> + bool registered; >>>> }; >>>> >>>> struct io_ring_ctx { >>>> @@ -7497,8 +7498,8 @@ static int io_init_wq_offload(struct io_ring_ctx *ctx, >>>> return ret; >>>> } >>>> >>>> -static int io_sq_offload_start(struct io_ring_ctx *ctx, >>>> - struct io_uring_params *p) >>>> +static int io_sq_offload_create(struct io_ring_ctx *ctx, >>>> + struct io_uring_params *p) >>>> { >>>> int ret; >>>> >>>> @@ -7532,7 +7533,6 @@ static int io_sq_offload_start(struct io_ring_ctx *ctx, >>>> ctx->sqo_thread = NULL; >>>> goto err; >>>> } >>>> - wake_up_process(ctx->sqo_thread); >>>> } else if (p->flags & IORING_SETUP_SQ_AFF) { >>>> /* Can't have SQ_AFF without SQPOLL */ >>>> ret = -EINVAL; >>>> @@ -7549,6 +7549,12 @@ static int io_sq_offload_start(struct io_ring_ctx *ctx, >>>> return ret; >>>> } >>>> >>>> +static void io_sq_offload_start(struct io_ring_ctx *ctx) >>>> +{ >>>> + if ((ctx->flags & IORING_SETUP_SQPOLL) && ctx->sqo_thread) >>>> + wake_up_process(ctx->sqo_thread); >>>> +} >>>> + >>>> static inline void __io_unaccount_mem(struct user_struct *user, >>>> unsigned long nr_pages) >>>> { >>>> @@ -8295,6 +8301,9 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit, >>>> if (!percpu_ref_tryget(&ctx->refs)) >>>> goto out_fput; >>>> >>>> + if (ctx->flags & IORING_SETUP_R_DISABLED) >>>> + goto out_fput; >>>> + >>> >>> While writing the man page paragraph, I discovered that if the rings are >>> disabled I returned ENXIO error in io_uring_enter(), coming from the previous >>> check. >>> >>> I'm not sure it is the best one, maybe I can return EBADFD or another >>> error. >>> >>> What do you suggest? >> >> EBADFD seems indeed the most appropriate - the fd is valid, but not in the >> right state to do this. > > Yeah, the same interpretation as mine! > > Also, in io_uring_register() I'm returning EINVAL if the rings are not > disabled and the user wants to register restrictions. > Maybe also in this case I can return EBADFD. Yes let's do that, EINVAL is always way too overloaded, and it makes sense to use EBADFD consistently for any operation related to that. -- Jens Axboe