On 5/21/20 12:39 PM, Thomas Gleixner wrote:
> Ming,
>
> Ming Lei <ming.lei@xxxxxxxxxx> writes:
>> On Thu, May 21, 2020 at 10:13:59AM +0200, Thomas Gleixner wrote:
>>> Ming Lei <ming.lei@xxxxxxxxxx> writes:
>>>> On Thu, May 21, 2020 at 12:14:18AM +0200, Thomas Gleixner wrote:
>>>> - otherwise, the kthread just retries and retries to allocate & release,
>>>> and sooner or later, its time slice is consumed, and migrated out, and the
>>>> cpu hotplug handler will get chance to run and move on, then the cpu is
>>>> shutdown.
>>>
>>> 1) This is based on the assumption that the kthread is in the SCHED_OTHER
>>> scheduling class. Is that really a valid assumption?
>>
>> Given it is unlikely path, we can add msleep() before retrying when INACTIVE bit
>> is observed by current thread, and this way can avoid spinning and should work
>> for other schedulers.
>
> That should work, but pretty is something else
>
>>>
>>> 2) What happens in the following scenario:
>>>
>>>    unplug
>>>
>>>      mq_offline
>>>        set_ctx_inactive()
>>>        drain_io()
>>>
>>>    io_kthread()
>>>      try_queue()
>>>      wait_on_ctx()
>>>
>>> Can this happen and if so what will wake up that thread?
>>
>> drain_io() releases all tag of this hctx, then wait_on_ctx() will be waken up
>> after any tag is released.
>
> drain_io() is already done ...
>
> So looking at that thread function:
>
> static int io_sq_thread(void *data)
> {
>         struct io_ring_ctx *ctx = data;
>
>         while (...) {
>                 ....
>                 to_submit = io_sqring_entries(ctx);
>
> --> preemption
>
> hotplug runs
>         mq_offline()
>                 set_ctx_inactive();
>                 drain_io();
>                 finished();
>
> --> thread runs again
>
>                 mutex_lock(&ctx->uring_lock);
>                 ret = io_submit_sqes(ctx, to_submit, NULL, -1, true);
>                 mutex_unlock(&ctx->uring_lock);
>
>                 ....
>
>                 if (!to_submit || ret == -EBUSY)
>                         ...
>                         wait_on_ctx();
>
> Can this happen or did drain_io() already take care of the 'to_submit'
> items and the call to io_submit_sqes() turns into a zero action ?
>
> If the above happens then nothing will wake it up because the context
> draining is done and finished.

Again, this is mixing up io_uring and blk-mq. Maybe it's the fact that
both use 'ctx' that makes this confusing. On the blk-mq side, the 'ctx'
is the per-cpu queue context, for io_uring it's the io_uring instance.

io_sq_thread() doesn't care about any sort of percpu mappings, it's
happy as long as it'll keep running regardless of whether or not the
optional pinned CPU is selected and then offlined.

-- 
Jens Axboe
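
To make the two meanings of 'ctx' concrete, here is a minimal userspace
sketch in C. It is a toy model, not kernel code: every name in it
(percpu_queue_ctx, ring_ctx, cpu_offline(), pick_cpu(), sq_thread_step())
is hypothetical and only loosely mirrors the concepts discussed above. It
assumes that a submission thread whose optional pinned CPU goes offline
simply continues on some other online CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* blk-mq style 'ctx' (hypothetical model): one queue context per CPU. */
struct percpu_queue_ctx {
	bool inactive;		/* set when the owning CPU goes offline */
};

/* io_uring style 'ctx' (hypothetical model): one context per ring instance. */
struct ring_ctx {
	int pinned_cpu;		/* optional preferred CPU, -1 if none */
	unsigned int submitted;	/* entries processed by the submission thread */
};

/* Zero-initialized, so every modeled CPU starts out online. */
static struct percpu_queue_ctx queue_ctx[NR_CPUS];

/* Model CPU hotplug: only the per-CPU context is marked inactive. */
static void cpu_offline(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	queue_ctx[cpu].inactive = true;
}

/* Prefer the pinned CPU while it is online, else take any online CPU. */
static int pick_cpu(const struct ring_ctx *ring)
{
	assert(ring->pinned_cpu < NR_CPUS);

	if (ring->pinned_cpu >= 0 && !queue_ctx[ring->pinned_cpu].inactive)
		return ring->pinned_cpu;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (!queue_ctx[cpu].inactive)
			return cpu;

	return -1;		/* no CPU left online */
}

/*
 * One iteration of the modeled submission thread. The ring context is
 * not tied to any per-CPU state, so the thread keeps making progress
 * as long as some CPU is online. Returns the CPU it ran on, or -1.
 */
static int sq_thread_step(struct ring_ctx *ring, unsigned int to_submit)
{
	int cpu = pick_cpu(ring);

	if (cpu < 0)
		return -1;

	ring->submitted += to_submit;
	return cpu;
}
```

In this model, offlining the pinned CPU changes where sq_thread_step()
runs, but not whether the ring context can make progress.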