On Fri, Dec 17, 2021 at 9:00 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> Just a single fix, fixing an issue with the worker creation change that
> was merged last week.

Hmm. I've pulled, but looking at the result, this is a classic no-no. You can't just randomly drop and re-take a lock and say it's "safe". Because I don't think it's necessarily safe at all.

When you drop the wqe->lock in the middle of io_wqe_dec_running to create a new worker, it means - for example - that "io_worker_exit()" can now run immediately on the new worker as far as I can tell.

So one io_worker_exit() may literally race with another one, where both are inside that io_wqe_dec_running() at the same time. And then they both end up doing

        worker->flags = 0;
        current->flags &= ~PF_IO_WORKER;

afterwards in the caller, and not necessarily in the original order. And then they'll both possibly do

        kfree_rcu(worker, rcu);

which sounds like a disaster.

Maybe this is all safe and things like the above cannot happen, but it sure is *not* obviously so.

Any time you release a lock in the middle of holding it, basically *everything* you do afterwards when you re-take it is suspect.

Don't perpetuate this broken pattern. Because even if it happens to be safe in this situation, it's _always_ broken garbage unless you have a big and exhaustive comment talking about why re-taking it suddenly makes everything that follows ok.

The way to do this properly is to either

 (a) make the code you ran under the lock ok to run under the lock

 (b) make the locked region have a *return value* that the code then uses to decide what to do after it has actually released the lock

But the whole "release and re-take" pattern is broken, broken, broken.

As mentioned, I've pulled this, but I seriously considered unpulling. Because I think that fix is wrong.

             Linus