On Fri, Mar 8, 2019 at 2:55 PM Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > > I'm going to run the usual build tests, but also look at the basic > sanity tests and boot and run them just to be careful before actually > doing that final "ok pushed out". While waiting for that, I'm looking at the file pointer refs, because that was completely buggered in fs/aio.c. What protects somebody from: - io_uring_register(IORING_REGISTER_FILES); - start async IO - io_uring_register(IORING_UNREGISTER_FILES); and now it had better synchronize everything. It looks like it migth work due to (a) the mutex_lock(&ctx->uring_lock) around registration (b) the wait_for_completion(&ctx->ctx_done) in __io_uring_register presumably waits for each outstanding request. HOWEVER. The io_kiocb reference counting seems to be the *exact* same bogus reference counting that fs/aio.c had, with the magical "zero means it was never initialized and counts as one" handling, which was buggy in fs/aio.c too, and caused serious problems with races between request creation (the "synchronous" part) and requests actually being finished asynchronously (the "completion" part). IOW, I can already tell that the reference counting looks suspicious with static void io_free_req(struct io_kiocb *req) { if (!refcount_read(&req->refs) || refcount_dec_and_test(&req->refs)) { where that whole first "oh, a zero refcount is magic" handling looks *very* suspicious. It basically says "some ops don't do references properly at all". This is *exactly* the same bogosity that fds/aio.c had, and it has *exactly* the same pattern, with the "poll" code doing /* one for removal from waitqueue, one for this function */ refcount_set(&req->refs, 2); and then everybody else seems to initialize req->refs to zero at allocation time, because they magically have no lifetime rules for the req. It was bogus garbage in fs/aio.c, and honestly, looking at how much of the logic looks suspiciously very similar jhere, I suspect it's bogus garbage here too. In fact, all the setup code looks suspiciously similar in general. It has the same broken "prep_rw" and "complete_rw" thing which was racy with files opened with O_DIRECT "competing" their IO while still holding i_rwsem on the inode, and then possibly things getting dropped. I'm getting the very strong signal that this code had all the logic taken from fs/aio.c, bugs and all. Al has patches to fix the aio.c cases. I very much suspect some very similar patches are needed in fs/io_ring.c. Linus