Re: [GIT PULL] Support for the io_uring IO interface

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 8, 2019 at 2:55 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> I'm going to run the usual build tests, but also look at the basic
> sanity tests and boot and run them just to be careful before actually
> doing that final "ok pushed out".

While waiting for that, I'm looking at the file pointer refs, because
that was completely buggered in fs/aio.c.

What protects somebody from:

 - io_uring_register(IORING_REGISTER_FILES);

 - start async IO

 - io_uring_register(IORING_UNREGISTER_FILES);

and now it had better synchronize everything.

It looks like it migth work due to

 (a) the mutex_lock(&ctx->uring_lock) around registration

 (b) the wait_for_completion(&ctx->ctx_done) in __io_uring_register
presumably waits for each outstanding request.

HOWEVER.

The io_kiocb reference counting seems to be the *exact* same bogus
reference counting that fs/aio.c had, with the magical "zero means it
was never initialized and counts as one" handling, which was buggy in
fs/aio.c too, and caused serious problems with races between request
creation (the "synchronous" part) and requests actually being finished
asynchronously (the "completion" part).

IOW, I can already tell that the reference counting looks suspicious with

  static void io_free_req(struct io_kiocb *req)
  {
        if (!refcount_read(&req->refs) || refcount_dec_and_test(&req->refs)) {

where that whole first "oh, a zero refcount is magic" handling looks
*very* suspicious. It basically says "some ops don't do references
properly at all".

This is *exactly* the same bogosity that fds/aio.c had, and it has
*exactly* the same pattern, with the "poll" code doing

        /* one for removal from waitqueue, one for this function */
        refcount_set(&req->refs, 2);

and then everybody else seems to initialize req->refs to zero at
allocation time, because they magically have no lifetime rules for the
req.

It was bogus garbage in fs/aio.c, and honestly, looking at how much of
the logic looks suspiciously very similar jhere, I suspect it's bogus
garbage here too.

In fact, all the setup code looks suspiciously similar in general. It
has the same broken "prep_rw" and "complete_rw" thing which was racy
with files opened with O_DIRECT "competing" their IO while still
holding i_rwsem on the inode, and then possibly things getting
dropped.

I'm getting the very strong signal that this code had all the logic
taken from fs/aio.c, bugs and all.

Al has patches to fix the aio.c cases. I very much suspect some very
similar patches are needed in fs/io_ring.c.

                        Linus



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux