Re: [PATCH 1/2] block: Implement global tagset

On Thu, 6 Apr 2017, 1:49am, Hannes Reinecke wrote:

> On 04/06/2017 08:27 AM, Arun Easi wrote:
> > Hi Hannes,
> > 
> > Thanks for taking a crack at the issue. My comments below..
> > 
> > On Tue, 4 Apr 2017, 5:07am, Hannes Reinecke wrote:
> > 
> >> Most legacy HBAs have a tagset per HBA, not per queue. To map
> >> these devices onto block-mq this patch implements a new tagset
> >> flag BLK_MQ_F_GLOBAL_TAGS, which will cause the tag allocator
> >> to use just one tagset for all hardware queues.
> >>
> >> Signed-off-by: Hannes Reinecke <hare@xxxxxxxx>
> >> ---
> >>  block/blk-mq-tag.c     | 12 ++++++++----
> >>  block/blk-mq.c         | 10 ++++++++--
> >>  include/linux/blk-mq.h |  1 +
> >>  3 files changed, 17 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> >> index e48bc2c..a14e76c 100644
> >> --- a/block/blk-mq-tag.c
> >> +++ b/block/blk-mq-tag.c
> >> @@ -276,9 +276,11 @@ static void blk_mq_all_tag_busy_iter(struct blk_mq_tags *tags,
> >>  void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
> >>  		busy_tag_iter_fn *fn, void *priv)
> >>  {
> >> -	int i;
> >> +	int i, lim = tagset->nr_hw_queues;
> >>  
> >> -	for (i = 0; i < tagset->nr_hw_queues; i++) {
> >> +	if (tagset->flags & BLK_MQ_F_GLOBAL_TAGS)
> >> +		lim = 1;
> >> +	for (i = 0; i < lim; i++) {
> >>  		if (tagset->tags && tagset->tags[i])
> >>  			blk_mq_all_tag_busy_iter(tagset->tags[i], fn, priv);
> >>  	}
> >> @@ -287,12 +289,14 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
> >>  
> >>  int blk_mq_reinit_tagset(struct blk_mq_tag_set *set)
> >>  {
> >> -	int i, j, ret = 0;
> >> +	int i, j, ret = 0, lim = set->nr_hw_queues;
> >>  
> >>  	if (!set->ops->reinit_request)
> >>  		goto out;
> >>  
> >> -	for (i = 0; i < set->nr_hw_queues; i++) {
> >> +	if (set->flags & BLK_MQ_F_GLOBAL_TAGS)
> >> +		lim = 1;
> >> +	for (i = 0; i < lim; i++) {
> >>  		struct blk_mq_tags *tags = set->tags[i];
> >>  
> >>  		for (j = 0; j < tags->nr_tags; j++) {
> >> diff --git a/block/blk-mq.c b/block/blk-mq.c
> >> index 159187a..db96ed0 100644
> >> --- a/block/blk-mq.c
> >> +++ b/block/blk-mq.c
> >> @@ -2061,6 +2061,10 @@ static bool __blk_mq_alloc_rq_map(struct blk_mq_tag_set *set, int hctx_idx)
> >>  {
> >>  	int ret = 0;
> >>  
> >> +	if ((set->flags & BLK_MQ_F_GLOBAL_TAGS) && hctx_idx != 0) {
> >> +		set->tags[hctx_idx] = set->tags[0];
> >> +		return true;
> >> +	}
> > 
:
> 
> > BTW, if you would like me to try out this patch on my setup, please let me 
> > know.
> > 
> Oh, yes. Please do.
> 
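
A minimal user-space sketch of the aliasing idea in the quoted hunk, for readers who want to see it in isolation. Everything prefixed sketch_ or SKETCH_ is a hypothetical stand-in, not a kernel definition; the only behaviour taken from the patch is that, with the global flag set, every hardware queue other than 0 reuses the tag map of queue 0, and iterators visit just one map. It assumes queue 0 is set up first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
#define SKETCH_F_GLOBAL_TAGS (1u << 0)	/* models BLK_MQ_F_GLOBAL_TAGS */
#define SKETCH_MAX_HW_QUEUES 8

struct sketch_tags {
	unsigned int nr_tags;
};

struct sketch_tag_set {
	unsigned int flags;
	unsigned int nr_hw_queues;
	struct sketch_tags *tags[SKETCH_MAX_HW_QUEUES];
	struct sketch_tags storage[SKETCH_MAX_HW_QUEUES];
};

/*
 * Mirrors the quoted __blk_mq_alloc_rq_map() hunk: with global tags,
 * queues other than 0 alias tags[0] instead of getting their own map.
 */
static bool sketch_alloc_rq_map(struct sketch_tag_set *set,
				unsigned int hctx_idx, unsigned int depth)
{
	if (hctx_idx >= set->nr_hw_queues || hctx_idx >= SKETCH_MAX_HW_QUEUES)
		return false;

	if ((set->flags & SKETCH_F_GLOBAL_TAGS) && hctx_idx != 0) {
		set->tags[hctx_idx] = set->tags[0];
		return true;
	}

	/* Per-queue case (and queue 0 in the global case): own tag map. */
	set->storage[hctx_idx].nr_tags = depth;
	set->tags[hctx_idx] = &set->storage[hctx_idx];
	return true;
}

/* Number of distinct tag maps an iterator has to visit ("lim" in the patch). */
static unsigned int sketch_iter_limit(const struct sketch_tag_set *set)
{
	return (set->flags & SKETCH_F_GLOBAL_TAGS) ? 1 : set->nr_hw_queues;
}
```

Calling sketch_alloc_rq_map() for indexes 0..3 on a set with the flag enabled leaves all four tags[] entries pointing at the same map, which is why the busy-iter and reinit loops in the patch stop after the first entry.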

Ran the tests on my setup (Dell R730, 2 nodes). This change did not drop 
any IOPS (got ~2M 512b). The cache miss percentage varied depending on 
whether the tests were running on one node or both (the latter performed 
worse). All interrupts were directed to only one node. Interestingly, the 
cache miss percentage was lowest when MQ was off.

I hit an fdisk hang (open path), btw; not sure if it has anything to do with 
this change, though.

Notes and hang stack attached.

Let me know if you are interested in any specific perf event/command-line.

Regards,
-Arun
perf stat, run on a short 10-second load.

---1port-1node-new-mq----
 Performance counter stats for 'CPU(s) 2':

 188,642,696      LLC-loads                                            (66.66%)
   3,615,142      LLC-load-misses  #    1.92% of all LL-cache hits     (66.67%)
  86,488,341      LLC-stores                                           (33.34%)
  10,820,977      LLC-store-misses                                     (33.33%)
 391,370,104      cache-references                                     (49.99%)
  14,498,491      cache-misses     #    3.705 % of all cache refs      (66.66%)

---1port-1node-mq---
 Performance counter stats for 'CPU(s) 2':

 145,025,999      LLC-loads                                            (66.67%)
   3,793,427      LLC-load-misses  #    2.62% of all LL-cache hits     (66.67%)
  60,878,939      LLC-stores                                           (33.33%)
   8,044,714      LLC-store-misses                                     (33.33%)
 294,713,070      cache-references                                     (50.00%)
  11,923,354      cache-misses     #    4.046 % of all cache refs      (66.66%)

---1port-1node-nomq---
 Performance counter stats for 'CPU(s) 2':

 157,375,709      LLC-loads                                            (66.66%)
     476,117      LLC-load-misses  #    0.30% of all LL-cache hits     (66.66%)
  76,046,098      LLC-stores                                           (33.34%)
     840,756      LLC-store-misses                                     (33.34%)
 326,230,969      cache-references                                     (50.00%)
   1,332,398      cache-misses     #    0.408 % of all cache refs      (66.67%)

======================

--2port-allnodes-new-mq--
 Performance counter stats for 'CPU(s) 2':

  55,455,533      LLC-loads                                            (66.67%)
  37,996,545      LLC-load-misses  #   68.52% of all LL-cache hits     (66.67%)
  14,030,291      LLC-stores                                           (33.33%)
   7,096,931      LLC-store-misses                                     (33.33%)
  76,711,197      cache-references                                     (49.99%)
  45,170,719      cache-misses     #   58.884 % of all cache refs      (66.66%)

--2port-allnodes-mq--
 Performance counter stats for 'CPU(s) 2':

  59,303,410      LLC-loads                                            (66.66%)
  31,115,601      LLC-load-misses  #   52.47% of all LL-cache hits     (66.66%)
  17,496,477      LLC-stores                                           (33.34%)
   6,201,373      LLC-store-misses                                     (33.34%)
  89,035,272      cache-references                                     (50.00%)
  37,372,777      cache-misses     #   41.975 % of all cache refs      (66.66%)

--2port-allnodes-nomq--
 Performance counter stats for 'CPU(s) 2':

  86,724,905      LLC-loads                                            (66.67%)
  27,154,245      LLC-load-misses  #   31.31% of all LL-cache hits     (66.67%)
  33,710,265      LLC-stores                                           (33.34%)
   6,521,394      LLC-store-misses                                     (33.33%)
 139,089,528      cache-references                                     (50.00%)
  33,682,000      cache-misses     #   24.216 % of all cache refs      (66.66%)
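
The percentages perf prints next to the miss counters are plain ratios of the two counts in the same block, so they can be rechecked by hand. A small sketch of that arithmetic follows; miss_percent is a hypothetical helper name, not part of perf. For the first block, 14,498,491 misses over 391,370,104 references comes out at about 3.705%, the figure shown above. The counts themselves are scaled estimates, since the trailing (66.66%)-style column is the share of time each counter was actually scheduled.

```c
#include <stdint.h>

/*
 * Miss percentage as shown in the perf stat output above:
 * 100 * misses / references. Returns 0 when there were no references,
 * to avoid dividing by zero.
 */
static double miss_percent(uint64_t misses, uint64_t refs)
{
	if (refs == 0)
		return 0.0;
	return 100.0 * (double)misses / (double)refs;
}
```

The same helper applies to the LLC-load lines, with LLC-load-misses as the numerator and LLC-loads as the denominator.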

Apr  6 17:34:05 avlnxperf kernel: INFO: task fdisk:27745 blocked for more than 120 seconds.
Apr  6 17:34:05 avlnxperf kernel:      Tainted: G    B      OE   4.11.0-rc4-newblk-ae+ #4
Apr  6 17:34:05 avlnxperf kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr  6 17:34:05 avlnxperf kernel: fdisk           D    0 27745  27743 0x00000080
Apr  6 17:34:05 avlnxperf kernel: Call Trace:
Apr  6 17:34:05 avlnxperf kernel: __schedule+0x289/0x8f0
Apr  6 17:34:05 avlnxperf kernel: schedule+0x36/0x80
Apr  6 17:34:05 avlnxperf kernel: schedule_timeout+0x249/0x300
Apr  6 17:34:05 avlnxperf kernel: ? sched_clock_cpu+0x11/0xb0
Apr  6 17:34:05 avlnxperf kernel: ? try_to_wake_up+0x59/0x450
Apr  6 17:34:05 avlnxperf kernel: wait_for_completion+0x121/0x180
Apr  6 17:34:05 avlnxperf kernel: ? wake_up_q+0x80/0x80
Apr  6 17:34:05 avlnxperf kernel: flush_work+0x11d/0x1c0
Apr  6 17:34:05 avlnxperf kernel: ? wake_up_worker+0x30/0x30
Apr  6 17:34:05 avlnxperf kernel: __cancel_work_timer+0x10e/0x1d0
Apr  6 17:34:05 avlnxperf kernel: ? kobj_lookup+0x10d/0x160
Apr  6 17:34:05 avlnxperf kernel: cancel_delayed_work_sync+0x13/0x20
Apr  6 17:34:05 avlnxperf kernel: disk_block_events+0x77/0x80
Apr  6 17:34:05 avlnxperf kernel: __blkdev_get+0x11b/0x4b0
Apr  6 17:34:05 avlnxperf kernel: blkdev_get+0x1c3/0x320
Apr  6 17:34:05 avlnxperf kernel: blkdev_open+0x5b/0x70
Apr  6 17:34:05 avlnxperf kernel: do_dentry_open+0x213/0x330
Apr  6 17:34:05 avlnxperf kernel: ? bd_acquire+0xd0/0xd0
Apr  6 17:34:05 avlnxperf kernel: vfs_open+0x4f/0x70
Apr  6 17:34:05 avlnxperf kernel: ? may_open+0x9b/0x100
Apr  6 17:34:05 avlnxperf kernel: path_openat+0x557/0x13c0
Apr  6 17:34:05 avlnxperf kernel: ? generic_file_read_iter+0x746/0x8c0
Apr  6 17:34:05 avlnxperf kernel: ? scsi_bios_ptable+0x54/0x130
Apr  6 17:34:05 avlnxperf kernel: do_filp_open+0x91/0x100
Apr  6 17:34:05 avlnxperf kernel: ? __alloc_fd+0x46/0x170
Apr  6 17:34:05 avlnxperf kernel: do_sys_open+0x124/0x210
Apr  6 17:34:05 avlnxperf kernel: ? __audit_syscall_exit+0x209/0x290
Apr  6 17:34:05 avlnxperf kernel: SyS_open+0x1e/0x20
Apr  6 17:34:05 avlnxperf kernel: do_syscall_64+0x67/0x180
Apr  6 17:34:05 avlnxperf kernel: entry_SYSCALL64_slow_path+0x25/0x25
Apr  6 17:34:05 avlnxperf kernel: RIP: 0033:0x7faef86b0a10
Apr  6 17:34:05 avlnxperf kernel: RSP: 002b:00007fffa7159438 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
Apr  6 17:34:05 avlnxperf kernel: RAX: ffffffffffffffda RBX: 0000000000b34310 RCX: 00007faef86b0a10
Apr  6 17:34:05 avlnxperf kernel: RDX: 00007fffa7159598 RSI: 0000000000080000 RDI: 00007fffa7159590
Apr  6 17:34:05 avlnxperf kernel: RBP: 00007fffa7159590 R08: 00007faef8610938 R09: 0000000000000008
Apr  6 17:34:05 avlnxperf kernel: R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
Apr  6 17:34:05 avlnxperf kernel: R13: 0000000000b34550 R14: 0000000000000005 R15: 0000000000000000
