On 27/05/2020 01:04, Jann Horn wrote:
> On Tue, May 26, 2020 at 8:11 PM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote:
>> It looks like taking ->uring_lock should work like kind of grace
>> period for struct files_struct and io_uring_flush(), and that would
>> solve the race with "fcheck(ctx->ring_fd) == ctx->ring_file".
>>
>> Can you take a look? If you like it, I'll send a proper patch
>> and a bunch of cleanups on top.
>>
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index a3dbd5f40391..012af200dc72 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -5557,12 +5557,11 @@ static int io_grab_files(struct io_kiocb *req)
>>           * the fd has changed since we started down this path, and disallow
>>           * this operation if it has.
>>           */
>> -        if (fcheck(ctx->ring_fd) == ctx->ring_file) {
>> -                list_add(&req->inflight_entry, &ctx->inflight_list);
>> -                req->flags |= REQ_F_INFLIGHT;
>> -                req->work.files = current->files;
>> -                ret = 0;
>> -        }
>> +        list_add(&req->inflight_entry, &ctx->inflight_list);
>> +        req->flags |= REQ_F_INFLIGHT;
>> +        req->work.files = current->files;
>> +        ret = 0;
>> +
>>          spin_unlock_irq(&ctx->inflight_lock);
>>          rcu_read_unlock();
>>
>> @@ -7479,6 +7478,10 @@ static int io_uring_release(struct inode *inode, struct file *file)
>>  static void io_uring_cancel_files(struct io_ring_ctx *ctx,
>>                                    struct files_struct *files)
>>  {
>> +        /* wait for all submitters that can race for @files */
>> +        mutex_lock(&ctx->uring_lock);
>> +        mutex_unlock(&ctx->uring_lock);
>> +
>>          while (!list_empty_careful(&ctx->inflight_list)) {
>>                  struct io_kiocb *cancel_req = NULL, *req;
>>                  DEFINE_WAIT(wait);
>
> First off: You're removing a check in io_grab_files() without changing
> the comment that describes the check; and the new comment you're
> adding in io_uring_cancel_files() is IMO too short to be useful.

Obviously, it was stripped down to show the idea; nobody is talking
about committing it as is. I hoped Jens remembered it well enough to
understand. Let me describe it in more detail then:

> I'm trying to figure out how your change is supposed to work, and I
> don't get it. If a submitter is just past fdget() (at which point no
> locks are held), the ->flush() caller can instantly take and drop the
> ->uring_lock, and then later the rest of the submission path will grab
> an unprotected pointer to the files_struct. Am I missing something?

old = tsk->files;
task_lock(tsk);
tsk->files = files;
task_unlock(tsk);
put_files_struct(old); /* i.e. ->flush(old) */

This is from reset_files_struct(), and I presume the whole idea of
io_uring's ->flush() is to protect against racing with @old files
going away like that, i.e. to ensure that no io_uring requests are
left holding @old files.

The only place where current->files is accessed and copied by io_uring
is io_grab_files(), which is called in the submission path. And the
whole submission path runs under @uring_mtx.

For your case, the submitter will take @uring_mtx only after the
lock/unlock in the flush has happened, so it won't see the old files
(happens-before by locking the mutex).

The idea behind the lock/unlock is:

- submitters that locked @uring_mtx (i.e. started submission) before
  the lock/unlock are waited for in the flush. These can potentially
  access @old.
- submitters that come after the lock/unlock won't see @old files.

So, no new request associated with @old can appear after that. All
that's left is to deal with the already submitted requests, and that's
done by the rest of io_uring_cancel_files().
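To make the happens-before argument concrete, below is a minimal,
self-contained sketch of the pattern (not the actual io_uring code;
submit_side()/flush_side() and the standalone @uring_mtx are made-up
names for illustration):

#include <linux/mutex.h>

static DEFINE_MUTEX(uring_mtx);

/*
 * Submission path: everything that reads current->files
 * (io_grab_files() in the real code) runs inside the critical section.
 */
static void submit_side(void)
{
        mutex_lock(&uring_mtx);
        /* ... grab current->files, mark the request inflight ... */
        mutex_unlock(&uring_mtx);
}

/* Flush path: the empty lock/unlock pair acts as a grace period. */
static void flush_side(void)
{
        /*
         * Returns only once every submitter that took the mutex before
         * us has dropped it; submitters that take it after us observe
         * the new ->files (happens-before via the mutex).
         */
        mutex_lock(&uring_mtx);
        mutex_unlock(&uring_mtx);

        /* ... cancel the requests submitted before the grace period ... */
}

The empty critical section carries no data by itself; the mutual
exclusion alone is what orders the flush against in-flight submitters.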
The one thing I don't know is why current->files was originally
accessed without protection in io_grab_files(); presumably the
rcu_read_lock() there is for that reason. Do you see anything
suspicious in the description above?

--
Pavel Begunkov