On Wed, Sep 2, 2020 at 1:11 AM Jens Axboe <axboe@xxxxxxxxx> wrote: > The restriction of needing fixed files for SQPOLL is problematic, and > prevents/inhibits several valid uses cases. > > There's no real good reason for us not to allow it, except we need to > have the sqpoll thread inherit current->files from the task that setup > the ring. We can't easily do that, since we'd introduce a circular > reference by holding on to our own file table. > > If we wait for the sqpoll thread to exit when the ring fd is closed, > then we can safely reference the task files_struct without holding > a reference to it. And once we inherit that in the SQPOLL thread, we > can support non-fixed files for SQPOLL. [...] > diff --git a/fs/io_uring.c b/fs/io_uring.c [...] > + /* > + * For SQPOLL usage - no reference is held to this file table, we > + * rely on fops->flush() and our callback there waiting for the users > + * to finish. > + */ > + struct files_struct *sqo_files; [...] > @@ -6621,6 +6622,10 @@ static int io_sq_thread(void *data) > > old_cred = override_creds(ctx->creds); > > + task_lock(current); > + current->files = ctx->sqo_files; > + task_unlock(current); [...] > @@ -7549,6 +7557,13 @@ static int io_sq_offload_create(struct io_ring_ctx *ctx, > if (!capable(CAP_SYS_ADMIN)) > goto err; > > + /* > + * We will exit the sqthread before current exits, so we can > + * avoid taking a reference here and introducing weird > + * circular dependencies on the files table. > + */ > + ctx->sqo_files = current->files; [...] > @@ -8239,8 +8254,10 @@ static int io_uring_flush(struct file *file, void *data) > /* > * If the task is going away, cancel work it may have pending > */ > - if (fatal_signal_pending(current) || (current->flags & PF_EXITING)) > + if (fatal_signal_pending(current) || (current->flags & PF_EXITING)) { > + io_sq_thread_stop(ctx); > io_wq_cancel_cb(ctx->io_wq, io_cancel_task_cb, current, true); > + } What happens when the uring setup syscall fails around io_uring_get_fd() (after the sq offload thread was started, but before an fd was installed)? Will that also properly wait for the sq_thread to go away before returning from the syscall (at which point the files_struct could disappear)?