Hi Thomas,

On Thu, May 21, 2020 at 10:13:59AM +0200, Thomas Gleixner wrote:
> Ming Lei <ming.lei@xxxxxxxxxx> writes:
> > On Thu, May 21, 2020 at 12:14:18AM +0200, Thomas Gleixner wrote:
> >> When the CPU is finally offlined, i.e. the CPU cleared the online bit in
> >> the online mask is definitely too late simply because it still runs on
> >> that outgoing CPU _after_ the hardware queue is shut down and drained.
> >
> > IMO, the patch in Christoph's blk-mq-hotplug.2 still works for percpu
> > kthread.
> >
> > It is just not optimal in the retrying, but it should be fine. When the
> > percpu kthread is scheduled on the CPU to be offlined:
> >
> > - if the kthread doesn't observe the INACTIVE flag, the allocated request
> > will be drained.
> >
> > - otherwise, the kthread just retries and retries to allocate & release,
> > and sooner or later, its time slice is consumed, and migrated out, and the
> > cpu hotplug handler will get chance to run and move on, then the cpu is
> > shutdown.
>
> 1) This is based on the assumption that the kthread is in the SCHED_OTHER
>    scheduling class. Is that really a valid assumption?

Given that it is an unlikely path, we can add msleep() before retrying
when the current thread observes the INACTIVE bit. This avoids spinning
and should work for other scheduling classes as well.

> 2) What happens in the following scenario:
>
>    unplug
>
>      mq_offline
>        set_ctx_inactive()
>        drain_io()
>
>    io_kthread()
>        try_queue()
>        wait_on_ctx()
>
>    Can this happen and if so what will wake up that thread?

drain_io() releases all tags of this hctx, so wait_on_ctx() will be
woken up as soon as any tag is released.

If wait_on_ctx() is waiting for some other generic resource, it will be
woken up once that resource becomes available.

thanks,
Ming