Re: [PATCH] block: Improve shared tag set performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

在 2023/10/19 2:00, Bart Van Assche 写道:
Remove the code for fair tag sharing because it significantly hurts
performance for UFS devices. Removing this code is safe because the
legacy block layer worked fine without any equivalent fairness
algorithm.

This algorithm hurts performance for UFS devices because UFS devices
have multiple logical units. One of these logical units (WLUN) is used
to submit control commands, e.g. START STOP UNIT. If any request is
submitted to the WLUN, the queue depth is reduced from 31 to 15 or
lower for data LUNs.

See also https://lore.kernel.org/linux-scsi/20221229030645.11558-1-ed.tsai@xxxxxxxxxxxx/

Note: it has been attempted to rework this algorithm. See also "[PATCH
RFC 0/7] blk-mq: improve tag fair sharing"
(https://lore.kernel.org/linux-block/20230618160738.54385-1-yukuai1@xxxxxxxxxxxxxxx/).
Given the complexity of that patch series, I do not expect that patch
series to be merged.

Sorry for such huge delay, I was struggled on implementing a smoothly
algorithm to borrow tags and return borrowed tags, and later I put this
on ice and focus on other stuff.

I had an idea to implement a state machine, however, the amount of code
was aggressive and I gave up. And later, I implemented a simple version,
and I tested it in your case, 32 tags and 2 shared node, result looks
good(see below), however, I'm not confident this can work well general.

Anyway, I'll send a new RFC verion for this, and please let me know if
you still think this approch is unacceptable.

Thanks,
Kuai

Test script:

[global]
ioengine=libaio
iodepth=2
bs=4k
direct=1
rw=randrw
group_reporting

[sda]
numjobs=32
filename=/dev/sda

[sdb]
numjobs=1
filename=/dev/sdb

Test result, by monitor new debugfs entry shared_tag_info:
time	active		available
	sda 	sdb	sda	sdb
0	0	0	32	32
1	16	2	16	16	-> start fair sharing
2	19	2	20	16
3	24	2	24	16
4	26	2	28	16 	-> borrow 32/8=4 tags each round
5	28	2	28	16	-> save at lease 4 tags for sdb
...

Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
Cc: Ming Lei <ming.lei@xxxxxxxxxx>
Cc: Keith Busch <kbusch@xxxxxxxxxx>
Cc: Damien Le Moal <damien.lemoal@xxxxxxxxxxxxxxxxxx>
Cc: Yu Kuai <yukuai1@xxxxxxxxxxxxxxx>
Cc: Ed Tsai <ed.tsai@xxxxxxxxxxxx>
Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
---
  block/blk-mq-tag.c |  4 ----
  block/blk-mq.c     |  3 ---
  block/blk-mq.h     | 39 ---------------------------------------
  3 files changed, 46 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index cc57e2dd9a0b..25334bfcabf8 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -105,10 +105,6 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
  static int __blk_mq_get_tag(struct blk_mq_alloc_data *data,
  			    struct sbitmap_queue *bt)
  {
-	if (!data->q->elevator && !(data->flags & BLK_MQ_REQ_RESERVED) &&
-			!hctx_may_queue(data->hctx, bt))
-		return BLK_MQ_NO_TAG;
-
  	if (data->shallow_depth)
  		return sbitmap_queue_get_shallow(bt, data->shallow_depth);
  	else
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e2d11183f62e..502dafa76716 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1760,9 +1760,6 @@ bool __blk_mq_alloc_driver_tag(struct request *rq)
  	if (blk_mq_tag_is_reserved(rq->mq_hctx->sched_tags, rq->internal_tag)) {
  		bt = &rq->mq_hctx->tags->breserved_tags;
  		tag_offset = 0;
-	} else {
-		if (!hctx_may_queue(rq->mq_hctx, bt))
-			return false;
  	}
tag = __sbitmap_queue_get(bt);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index f75a9ecfebde..14a22f6d3fdf 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -407,45 +407,6 @@ static inline void blk_mq_free_requests(struct list_head *list)
  	}
  }
-/*
- * For shared tag users, we track the number of currently active users
- * and attempt to provide a fair share of the tag depth for each of them.
- */
-static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
-				  struct sbitmap_queue *bt)
-{
-	unsigned int depth, users;
-
-	if (!hctx || !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED))
-		return true;
-
-	/*
-	 * Don't try dividing an ant
-	 */
-	if (bt->sb.depth == 1)
-		return true;
-
-	if (blk_mq_is_shared_tags(hctx->flags)) {
-		struct request_queue *q = hctx->queue;
-
-		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
-			return true;
-	} else {
-		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
-			return true;
-	}
-
-	users = READ_ONCE(hctx->tags->active_queues);
-	if (!users)
-		return true;
-
-	/*
-	 * Allow at least some tags
-	 */
-	depth = max((bt->sb.depth + users - 1) / users, 4U);
-	return __blk_mq_active_requests(hctx) < depth;
-}
-
  /* run the code block in @dispatch_ops with rcu/srcu read lock held */
  #define __blk_mq_run_dispatch_ops(q, check_sleep, dispatch_ops)	\
  do {								\

.





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux