Hi Kashyap, Thanks for your test. On Sun, Jun 02, 2019 at 03:11:26AM +0530, Kashyap Desai wrote: > > SCSI's reply qeueue is very similar with blk-mq's hw queue, both > assigned by > > IRQ vector, so map te private reply queue into blk-mq's hw queue via > > .host_tagset. > > > > Then the private reply mapping can be removed. > > > > Another benefit is that the request/irq lost issue may be solved in > generic > > approach because managed IRQ may be shutdown during CPU hotplug. > > Ming, > > I quickly tested this patch series on MegaRaid Aero controller. Without > this patch I can get 3.0M IOPS, but once I apply this patch I see only > 1.2M IOPS (40% Performance drop) > HBA supports 5089 can_queue. > > <perf top> output without patch - > > 3.39% [megaraid_sas] [k] complete_cmd_fusion > 3.36% [kernel] [k] scsi_queue_rq > 3.26% [kernel] [k] entry_SYSCALL_64 > 2.57% [kernel] [k] syscall_return_via_sysret > 1.95% [megaraid_sas] [k] megasas_build_and_issue_cmd_fusion > 1.88% [kernel] [k] _raw_spin_lock_irqsave > 1.79% [kernel] [k] gup_pmd_range > 1.73% [kernel] [k] _raw_spin_lock > 1.68% [kernel] [k] __sched_text_start > 1.19% [kernel] [k] irq_entries_start > 1.13% [kernel] [k] scsi_dec_host_busy > 1.08% [kernel] [k] aio_complete > 1.07% [kernel] [k] read_tsc > 1.01% [kernel] [k] blk_mq_get_request > 0.93% [kernel] [k] __update_load_avg_cfs_rq > 0.92% [kernel] [k] aio_read_events > 0.91% [kernel] [k] lookup_ioctx > 0.91% fio [.] fio_gettime > 0.87% [kernel] [k] set_next_entity > 0.87% [megaraid_sas] [k] megasas_build_ldio_fusion > > <perf top> output with patch - > > 11.30% [kernel] [k] native_queued_spin_lock_slowpath I guess there must be one global lock required in megaraid submission path, could you run 'perf record -g -a' to see which lock is and what the stack trace is? Thanks, Ming