On Fri, Aug 18, 2023 at 10:01 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>
> group_cpus_evenly() could be part of storage driver's error handler,
> such as nvme driver, which may happen during CPU hotplug, in which
> storage queue has to drain its pending IOs because all CPUs associated
> with the queue are offline and the queue is becoming inactive. And
> handling IO needs error handler to provide forward progress.
>
> Then deadlock is caused:
>
> 1) inside CPU hotplug handler, CPU hotplug lock is held, and blk-mq's
> handler is waiting for inflight IO
>
> 2) error handler is waiting for CPU hotplug lock
>
> 3) inflight IO can't be completed in blk-mq's CPU hotplug handler because
> error handling can't provide forward progress.
>
> Solve the deadlock by not holding CPU hotplug lock in group_cpus_evenly(),
> in which two stage spreads are taken: 1) the 1st stage is over all present
> CPUs; 2) the end stage is over all other CPUs.
>
> Turns out the two stage spread just needs consistent 'cpu_present_mask', and
> remove the CPU hotplug lock by storing it into one local cache. This way
> doesn't change correctness, because all CPUs are still covered.
>
> Cc: Keith Busch <kbusch@xxxxxxxxxx>
> Cc: linux-nvme@xxxxxxxxxxxxxxxxxxx
> Cc: linux-block@xxxxxxxxxxxxxxx
> Reported-by: Yi Zhang <yi.zhang@xxxxxxxxxx>
> Reported-by: Guangwu Zhang <guazhang@xxxxxxxxxx>
> Tested-by: Guangwu Zhang <guazhang@xxxxxxxxxx>
> Reviewed-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> ---
> V3:
> - reuse `npresmsk`, and avoid allocating a new variable, suggested by
> Chengming Zhou

Hello Thomas and Jens,

Ping...

Thanks,
Ming
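
For anyone skimming the thread, the two stage spread described in the quoted
commit message can be illustrated with a small userspace sketch. This is not
the kernel code: possible_mask, present_mask, spread(), group_cpus_sketch()
and the between_stages hook are hypothetical names, CPUs are modelled as bits
in a 64-bit word, and groups are filled round-robin rather than per NUMA node.
The only point it tries to show is that taking one local snapshot of the
present mask keeps both stages consistent without holding any hotplug lock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for cpu_possible_mask and cpu_present_mask. */
static uint64_t possible_mask = 0xff;          /* CPUs 0-7 may ever exist */
static volatile uint64_t present_mask = 0x0f;  /* CPUs 0-3 present; may change */

/* Hypothetical hook that lets a caller simulate hotplug between the stages. */
static void (*between_stages)(void);

/* Assign every CPU set in 'mask' to a group, round-robin from *next. */
static void spread(uint64_t mask, uint64_t groups[], unsigned ngroups,
                   unsigned *next)
{
    for (unsigned cpu = 0; cpu < 64; cpu++) {
        if (mask & (UINT64_C(1) << cpu)) {
            groups[*next % ngroups] |= UINT64_C(1) << cpu;
            (*next)++;
        }
    }
}

static void group_cpus_sketch(uint64_t groups[], unsigned ngroups)
{
    unsigned next = 0;
    /* Read the present mask once into a local copy; no lock is taken. */
    uint64_t snapshot = present_mask;

    assert(ngroups > 0);
    for (unsigned i = 0; i < ngroups; i++)
        groups[i] = 0;

    /* Stage 1: spread the CPUs that were present in the snapshot. */
    spread(snapshot, groups, ngroups, &next);

    if (between_stages)
        between_stages();  /* the live present_mask may change here */

    /* Stage 2: spread all other possible CPUs, using the same snapshot,
     * so each possible CPU is assigned exactly once. */
    spread(possible_mask & ~snapshot, groups, ngroups, &next);
}
```

Calling group_cpus_sketch() with four groups and the masks above puts CPUs 0
and 4 in the first group, 1 and 5 in the second, and so on. If the hook
changes present_mask between the stages, the result is the same, because the
second stage is computed from the snapshot and not from the live mask.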