Reading through this once.. I'll read through it another time tomorrow before merging it, but on the whole it looks good. The concept certainly is the right now, but I have a few questions about some of the details.. On Fri, Feb 26, 2021 at 2:04 PM Jens Axboe <axboe@xxxxxxxxx> wrote: > > io-wq: fix races around manager/worker creation and task exit Where did that + __set_current_state(TASK_RUNNING); in the patch come from? That looks odd. Is it because io_wq_manager() does a set_current_state(TASK_INTERRUPTIBLE); at one point? Regardless, that code looks very very strange. > io_uring: remove the need for relying on an io-wq fallback worker This is coded really oddly. + do { + head = NULL; + work = READ_ONCE(ctx->exit_task_work); + } while (cmpxchg(&ctx->exit_task_work, work, head) != work); Whaa? If you want to write work = xchg(&ctx->exit_task_work, NULL); then just do that. Instead of an insane cmpxchg loop that isn't even well-written. I say that it isn't even well-written, because when you really do want a cmpxchg loop, then realize that cmpxchg() returns the old value, so the proper way to write it is new = whatever; expect = READ_ONCE(ctx->exit_task_work); for (;;) { new->next = expect; // Mumble mumble - this is why you want the cmpxchg was = cmpxchg(&ctx->exit_task_work, expect, new); if (was == expect) break; expect = was; } IOW, that READ_ONCE() of the old value should happen only once - and isn't worth a READ_ONCE() in the first place. There's nothing "read once" about it. But as mentioned, if all you want is an atomic "get and replace with NULL", then just a plain "xchg()" is what you should do. Yes, that cmpxchg loop _works_, but it is really confusing to read that way, when clearly you don't actually need or really want a cmpxchg. (The other cmpxchg loop in the same patch is needed, but does that unnecessary "re-read the value that cmpxchg just returned"). Please explain. Linus