RE: [PATCH RFC v7 10/12] megaraid_sas: switch fusion adapters to MQ


 



> > > I also noticed nr_hw_queues are now exposed in sysfs -
> > >
> > > /sys/devices/pci0000:85/0000:85:00.0/0000:86:00.0/0000:87:04.0/0000:8b:00.0/0000:8c:00.0/0000:8d:00.0/host14/scsi_host/host14/nr_hw_queues:128
> >
> > That's on my v8 wip branch, so I guess you're picking it up from there.
>
> John - I did more testing on v8 wip branch.  CPU hotplug is working as
> expected, but I still see some performance issue on Logical Volumes.
>
> I created 8 Drives Raid-0 VD on MR controller and below is performance
> impact of this RFC. Looks like contention is on single <sdev>.
>
> I used the command - "numactl -N 1 fio 1vd.fio --iodepth=128 --bs=4k --rw=randread --cpus_allowed_policy=split --ioscheduler=none --group_reporting --runtime=200 --numjobs=1"
> IOPS without RFC = 300K; IOPS with RFC = 230K.
>
> Perf top (shared host tag. IOPS = 230K)
>
> 13.98%  [kernel]        [k] sbitmap_any_bit_set
>      6.43%  [kernel]        [k] blk_mq_run_hw_queue

The blk_mq_run_hw_queue() function, called from scsi_end_request(), takes
more CPU here. It looks like blk_mq_hctx_has_pending() only matters for the
elevator (scheduler) case; if the queue has ioscheduler=none, we can skip
it, because with no scheduler IO is pushed straight to the hardware queue
and bypasses the software queue. Based on that understanding I added the
patch below, and I can see performance scale back to expectation.

Ming mentioned that we cannot remove blk_mq_run_hw_queues() from the IO
completion path, otherwise we may see IO hangs. So I have only modified the
completion path, assuming it is required just for the IO scheduler case.
https://www.spinics.net/lists/linux-block/msg55049.html

Please review and let me know if this is good, or whether we need to
address it with a proper fix.

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1be7ac5a4040..b6a5b41b7fc2 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1559,6 +1559,9 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async)
        struct blk_mq_hw_ctx *hctx;
        int i;

+       if (!q->elevator)
+               return;
+
        queue_for_each_hw_ctx(q, hctx, i) {
                if (blk_mq_hctx_stopped(hctx))
                        continue;

Kashyap

>      6.00%  [kernel]        [k] __audit_syscall_exit
>      3.47%  [kernel]        [k] read_tsc
>      3.19%  [megaraid_sas]  [k] complete_cmd_fusion
>      3.19%  [kernel]        [k] irq_entries_start
>      2.75%  [kernel]        [k] blk_mq_run_hw_queues
>      2.45%  fio             [.] fio_gettime
>      1.76%  [kernel]        [k] entry_SYSCALL_64
>      1.66%  [kernel]        [k] add_interrupt_randomness
>      1.48%  [megaraid_sas]  [k] megasas_build_and_issue_cmd_fusion
>      1.42%  [kernel]        [k] copy_user_generic_string
>      1.36%  [kernel]        [k] scsi_queue_rq
>      1.03%  [kernel]        [k] kmem_cache_alloc
>      1.03%  [kernel]        [k] internal_get_user_pages_fast
>      1.01%  [kernel]        [k] swapgs_restore_regs_and_return_to_usermode
>      0.96%  [kernel]        [k] kmem_cache_free
>      0.88%  [kernel]        [k] blkdev_direct_IO
>      0.84%  fio             [.] td_io_queue
>      0.83%  [kernel]        [k] __get_user_4
>
> Perf top (shared host tag. IOPS = 300K)
>
>     6.36%  [kernel]        [k] unroll_tree_refs
>      5.77%  [kernel]        [k] __do_softirq
>      4.56%  [kernel]        [k] irq_entries_start
>      4.38%  [kernel]        [k] read_tsc
>      3.95%  [megaraid_sas]  [k] complete_cmd_fusion
>      3.21%  fio             [.] fio_gettime
>      2.98%  [kernel]        [k] add_interrupt_randomness
>      1.79%  [kernel]        [k] entry_SYSCALL_64
>      1.61%  [kernel]        [k] copy_user_generic_string
>      1.61%  [megaraid_sas]  [k] megasas_build_and_issue_cmd_fusion
>      1.34%  [kernel]        [k] scsi_queue_rq
>      1.11%  [kernel]        [k] kmem_cache_free
>      1.05%  [kernel]        [k] blkdev_direct_IO
>      1.05%  [kernel]        [k] internal_get_user_pages_fast
>      1.00%  [kernel]        [k] __memset
>      1.00%  fio             [.] td_io_queue
>      0.98%  [kernel]        [k] kmem_cache_alloc
>      0.94%  [kernel]        [k] __get_user_4
>      0.93%  [kernel]        [k] lookup_ioctx
>      0.88%  [kernel]        [k] sbitmap_any_bit_set
>
> Kashyap
>
> >
> > Thanks,
> > John


