On 12/17/21 12:45 PM, Linus Torvalds wrote: > On Fri, Dec 17, 2021 at 9:00 AM Jens Axboe <axboe@xxxxxxxxx> wrote: >> >> Just a single fix, fixing an issue with the worker creation change that >> was merged last week. > > Hmm. I've pulled, but looking at the result, this is a classic no-no. > > You can't just randomly drop and re-take a lock and sat it's "safe". > > Because I don't think it's necessarily safe at all. > > When you drop the wqe->lock in the middle of io_wqe_dec_running to > create a new worker, it means - for example - that "io_worker_exit()" > can now run immediately on the new worker as far as I can tell. > > So one io_worker_exit() m,ay literally race with another one, where > both are inside that io_wqe_dec_running() at the same time. And then > they both end up doing > > worker->flags = 0; > current->flags &= ~PF_IO_WORKER; > > afterwards in the caller, and not necessarily in the original order. > And then they'll both possible do > > kfree_rcu(worker, rcu); > > which sounds like a disaster. The worker itself calls io_worker_exit(), so it cannot happen from within io_wqe_dec_running for the existing one. And that's really all we care about. The new worker can come and go and we don't really care about it, we know we're within another worker. That said, I totally do agree that this pattern is not a great one and should be avoided if at all possible. This one should be solvable by passing back a "do the cancel" information from io_queue_worker_create(), but that also gets a bit ugly in terms of having three return types essentially... I'll have a think about how to do this in a saner fashion that's more obviously correct. -- Jens Axboe