[PATCH v2] block/dm: fix handling of busy off direct dispatch path

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



After the direct dispatch corruption fix, we permanently disallow direct
dispatch of non read/write requests. This works fine off the normal IO
path, as they will be retried like any other failed direct dispatch
request. But for the blk_insert_cloned_request() that only DM uses to
bypass the bottom level scheduler, we always first attempt direct
dispatch. For some types of requests, that's now a permanent failure,
and no amount of retrying will make that succeed.

Use the driver private RQF_DONTPREP to track this condition in DM. If
we encounter a BUSY condition from blk_insert_cloned_request(), then
flag the request with RQF_DONTPREP. When we next time see this request,
ask blk_insert_cloned_request() to bypass insert the request directly.
This avoids the livelock of repeatedly trying to direct dispatch a
request, while still retaining the BUSY feedback loop for blk-mq so
that we don't over-dispatch to the lower level queue and mess up
opportunities for merging on the DM queue.

Fixes: ffe81d45322c ("blk-mq: fix corruption with direct issue")
Reported-by: Bart Van Assche <bvanassche@xxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>

---

This passes my testing as well, like the previous patch. But unlike the
previous patch, we retain the BUSY feedback loop information for better
merging.

diff --git a/block/blk-core.c b/block/blk-core.c
index deb56932f8c4..cccda51e165f 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2617,7 +2617,8 @@ static int blk_cloned_rq_check_limits(struct request_queue *q,
  * @q:  the queue to submit the request
  * @rq: the request being queued
  */
-blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq)
+blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq,
+				       bool force_insert)
 {
 	unsigned long flags;
 	int where = ELEVATOR_INSERT_BACK;
@@ -2637,7 +2638,11 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 		 * bypass a potential scheduler on the bottom device for
 		 * insert.
 		 */
-		return blk_mq_request_issue_directly(rq);
+		if (force_insert) {
+			blk_mq_request_bypass_insert(rq, true);
+			return BLK_STS_OK;
+		} else
+			return blk_mq_request_issue_directly(rq);
 	}
 
 	spin_lock_irqsave(q->queue_lock, flags);
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 7cd36e4d1310..e497a2ab6766 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -299,16 +299,20 @@ static void end_clone_request(struct request *clone, blk_status_t error)
 
 static blk_status_t dm_dispatch_clone_request(struct request *clone, struct request *rq)
 {
+	bool was_busy = (rq->rq_flags & RQF_DONTPREP) != 0;
 	blk_status_t r;
 
 	if (blk_queue_io_stat(clone->q))
 		clone->rq_flags |= RQF_IO_STAT;
 
 	clone->start_time_ns = ktime_get_ns();
-	r = blk_insert_cloned_request(clone->q, clone);
-	if (r != BLK_STS_OK && r != BLK_STS_RESOURCE && r != BLK_STS_DEV_RESOURCE)
+	r = blk_insert_cloned_request(clone->q, clone, was_busy);
+	if (r == BLK_STS_RESOURCE || r == BLK_STS_DEV_RESOURCE)
+		rq->rq_flags |= RQF_DONTPREP;
+	else if (r != BLK_STS_OK)
 		/* must complete clone in terms of original request */
 		dm_complete_request(rq, r);
+
 	return r;
 }
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4293dc1cd160..7cb84ee4c9f4 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -994,7 +994,7 @@ extern int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
 			     void *data);
 extern void blk_rq_unprep_clone(struct request *rq);
 extern blk_status_t blk_insert_cloned_request(struct request_queue *q,
-				     struct request *rq);
+				     struct request *rq, bool force_insert);
 extern int blk_rq_append_bio(struct request *rq, struct bio **bio);
 extern void blk_delay_queue(struct request_queue *, unsigned long);
 extern void blk_queue_split(struct request_queue *, struct bio **);

-- 
Jens Axboe




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux