On Thu, 2018-01-11 at 14:01 +0800, Ming Lei wrote:
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 86bf502a8e51..fcddf5a62581 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -533,8 +533,20 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
>  		if (queue_dying) {
>  			atomic_inc(&m->pg_init_in_progress);
>  			activate_or_offline_path(pgpath);
> +			return DM_MAPIO_DELAY_REQUEUE;
>  		}
> -		return DM_MAPIO_DELAY_REQUEUE;
> +
> +		/*
> +		 * blk-mq's SCHED_RESTART can cover this requeue, so
> +		 * we needn't to deal with it by DELAY_REQUEUE. More
> +		 * importantly, we have to return DM_MAPIO_REQUEUE
> +		 * so that blk-mq can get the queue busy feedback,
> +		 * otherwise I/O merge can be hurt.
> +		 */
> +		if (q->mq_ops)
> +			return DM_MAPIO_REQUEUE;
> +		else
> +			return DM_MAPIO_DELAY_REQUEUE;
>  	}

Sorry but the approach of this patch looks wrong to me. I'm afraid that
this approach will cause 100% CPU consumption if the underlying .queue_rq()
function returns BLK_STS_RESOURCE for another reason than what can be
detected by the .get_budget() call. This can happen if e.g. a SCSI LLD
.queuecommand() implementation returns SCSI_MLQUEUE_HOST_BUSY. Many SCSI
LLDs can do this:

$ git grep 'SCSI_MLQUEUE_[^_]*_BUSY' | wc -l
204

Isn't this a severe regression?

Bart.
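
To illustrate the CPU consumption concern, here is a user-space toy model. It is not kernel code and does not reproduce blk-mq, SCHED_RESTART or dm-mpath behavior. All names (toy_dispatch_attempts, toy_compare, busy_ticks, delay_ticks) are made up for this sketch. It only assumes a lower driver that keeps reporting "busy" for a fixed number of ticks, and counts how often the request is dispatched when it is retried at once versus after a delay.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Toy model: count how many times a request is dispatched before the
 * lower driver accepts it. The driver reports "busy" during the first
 * busy_ticks ticks. Every attempt costs one tick, and delay_ticks extra
 * ticks pass before each retry (0 models an immediate requeue).
 * Inputs are assumed small enough that the tick counter cannot overflow.
 */
unsigned long toy_dispatch_attempts(unsigned long busy_ticks,
				    unsigned long delay_ticks)
{
	unsigned long now = 0;
	unsigned long attempts = 0;

	for (;;) {
		attempts++;
		if (now >= busy_ticks)
			return attempts;	/* driver no longer busy */
		now += 1 + delay_ticks;		/* requeue and try again */
	}
}

/* Print the attempt counts for immediate and delayed requeue. */
void toy_compare(unsigned long busy_ticks, unsigned long delay_ticks)
{
	unsigned long immediate = toy_dispatch_attempts(busy_ticks, 0);
	unsigned long delayed = toy_dispatch_attempts(busy_ticks, delay_ticks);

	/* A delay can only reduce the number of wasted attempts. */
	assert(delayed <= immediate);
	printf("busy for %lu ticks: %lu attempts immediate, %lu attempts with %lu-tick delay\n",
	       busy_ticks, immediate, delayed, delay_ticks);
}
```

In this model, toy_compare(1000, 99) reports 1001 attempts for the immediate requeue and 11 for the delayed one. The absolute numbers mean nothing for a real system. The point is only that without a delay, every tick during which the driver stays busy is spent on a dispatch attempt that fails.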