On 04/26/2017 08:37 PM, Bart Van Assche wrote:
> If blk_get_request() fails check whether the failure is due to
> a path being removed. If that is the case fail the path by
> triggering a call to fail_path(). This patch avoids that the
> following scenario can be encountered while removing paths:
> * CPU usage of a kworker thread jumps to 100%.
> * Removing the dm device becomes impossible.
>
> Delay requeueing if blk_get_request() returns -EBUSY or
> -EWOULDBLOCK because in these cases immediate requeuing is
> inappropriate.
>
> Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
> Cc: Hannes Reinecke <hare@xxxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> ---
>  drivers/md/dm-mpath.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 909098e18643..6d4333fdddf5 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -490,6 +490,7 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
>  	struct pgpath *pgpath;
>  	struct block_device *bdev;
>  	struct dm_mpath_io *mpio = get_mpio(map_context);
> +	struct request_queue *q;
>  	struct request *clone;
>
>  	/* Do we need to select a new pgpath? */
> @@ -512,13 +513,19 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
>  	mpio->nr_bytes = nr_bytes;
>
>  	bdev = pgpath->path.dev->bdev;
> -
> -	clone = blk_get_request(bdev_get_queue(bdev),
> -			rq->cmd_flags | REQ_NOMERGE,
> -			GFP_ATOMIC);
> +	q = bdev_get_queue(bdev);
> +	clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC);
>  	if (IS_ERR(clone)) {
>  		/* EBUSY, ENODEV or EWOULDBLOCK: requeue */
> -		return r;
> +		pr_debug("blk_get_request() returned %ld%s - requeuing\n",
> +			 PTR_ERR(clone), blk_queue_dying(q) ?
> +			 " (path offline)" : "");
> +		if (blk_queue_dying(q)) {
> +			atomic_inc(&m->pg_init_in_progress);
> +			activate_path(pgpath);
> +			return DM_MAPIO_REQUEUE;
> +		}
> +		return DM_MAPIO_DELAY_REQUEUE;
>  	}
>  	clone->bio = clone->biotail = NULL;
>  	clone->rq_disk = bdev->bd_disk;
>
At the very least this does warrant some inline comments.
Why do we call activate_path() here, seeing that the queue is dying?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@xxxxxxx			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)