Re: [PATCH] blk-mq: don't fail driver tag allocation because of inactive hctx

On Thu, Jun 04, 2020 at 01:45:09PM +0100, John Garry wrote:
> 
> > > That's your patch - ok, I can try.
> > > 
> 
> I still get timeouts and sometimes the same driver tag message occurs:
> 
> [ 1014.232417] run queue from wrong CPU 0, hctx active
> [ 1014.237692] run queue from wrong CPU 0, hctx active
> [ 1014.243014] run queue from wrong CPU 0, hctx active
> [ 1014.248370] run queue from wrong CPU 0, hctx active
> [ 1014.253725] run queue from wrong CPU 0, hctx active
> [ 1014.259252] run queue from wrong CPU 0, hctx active
> [ 1014.264492] run queue from wrong CPU 0, hctx active
> [ 1014.269453] irq_shutdown irq146
> [ 1014.272752] CPU55: shutdown
> [ 1014.275552] psci: CPU55 killed (polled 0 ms)
> [ 1015.151530] CPU56: shutdownr=1621MiB/s,w=0KiB/s][r=415k,w=0 IOPS][eta
> 00m:00s]
> [ 1015.154322] psci: CPU56 killed (polled 0 ms)
> [ 1015.184345] CPU57: shutdown
> [ 1015.187143] psci: CPU57 killed (polled 0 ms)
> [ 1015.223388] CPU58: shutdown
> [ 1015.226174] psci: CPU58 killed (polled 0 ms)
> long sleep 8
> [ 1045.234781] scsi_times_out req=0xffff041fa13e6300[r=0,w=0 IOPS][eta
> 04m:30s]
> 
> [...]
> 
> > > 
> > > I thought that if all the sched tags are put, then we should have no driver
> > > tag for that same hctx, right? That seems to coincide with the timeout (30
> > > seconds later)
> > 
> > That is weird; if a driver tag is found, that means the request is
> > in-flight and can't be completed by HW.
> 
> In blk_mq_hctx_has_requests(), we iterate the sched tags (when
> hctx->sched_tags is set). So can some requests not have a sched tag (even
> when a scheduler is set for the queue)?
> 
> > I assume you have integrated
> > global host tags patch in your test,
> 
> No, but the LLDD does not use request->tag - it generates its own.
> 
> > and suggest you to double check
> > hisi_sas's queue mapping which has to be exactly same with blk-mq's
> > mapping.
> > 
> 
> scheduler=none is OK, so I am skeptical of a problem there.

Please try the following patch; the current code may not drain in-flight
requests correctly:

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 97bb650f0ed6..ae110e2754bf 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -265,6 +265,7 @@ struct bt_tags_iter_data {
 
 #define BT_TAG_ITER_RESERVED		(1 << 0)
 #define BT_TAG_ITER_STARTED		(1 << 1)
+#define BT_TAG_ITER_STATIC_RQS		(1 << 2)
 
 static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 {
@@ -280,7 +281,10 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	 * We can hit rq == NULL here, because the tagging functions
 	 * test and set the bit before assining ->rqs[].
 	 */
-	rq = tags->rqs[bitnr];
+	if (iter_data->flags & BT_TAG_ITER_STATIC_RQS)
+		rq = tags->static_rqs[bitnr];
+	else
+		rq = tags->rqs[bitnr];
 	if (!rq)
 		return true;
 	if ((iter_data->flags & BT_TAG_ITER_STARTED) &&
@@ -335,11 +339,13 @@ static void __blk_mq_all_tag_iter(struct blk_mq_tags *tags,
  *		indicates whether or not @rq is a reserved request. Return
  *		true to continue iterating tags, false to stop.
  * @priv:	Will be passed as second argument to @fn.
+ *
+ * Caller has to pass the tag map from which requests are allocated.
  */
 void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
 		void *priv)
 {
-	return __blk_mq_all_tag_iter(tags, fn, priv, 0);
+	return __blk_mq_all_tag_iter(tags, fn, priv, BT_TAG_ITER_STATIC_RQS);
 }
 
 /**


Thanks,
Ming



