On Tue, Jan 12, 2021 at 08:56:45AM +0000, John Garry wrote:
> Hi Ming,
>
> > >
> > > I was looking at some IOMMU issue on a LSI RAID 3008 card, and noticed
> > > that performance there is not what I get on other SAS HBAs - it's lower.
> > >
> > > After some debugging and fiddling with sdev queue depth in the mpt3sas
> > > driver, I am finding that performance changes appreciably with sdev
> > > queue depth:
> > >
> > > sdev qdepth      fio number jobs*     1       10      20
> > > 16                                    1590    1654    1660
> > > 32                                    1545    1646    1654
> > > 64                                    1436    1085    1070
> > > 254 (default)                         1436    1070    1050
> >
> > What does the performance number mean? IOPS or others? What is the fio
> > io test? random IO or sequential IO?
>
> Those figures are x1K read IOPS, so 1590 above is 1.59M read IOPS. Here's
> the fio script:
>
> [global]
> rw=read
> direct=1
> ioengine=libaio
> iodepth=40
> numjobs=20
> bs=4k
> ;size=10240000m
> ;zero_buffers=1
> group_reporting=1
> ;ioscheduler=noop
> ;cpumask=0xffe
> ;cpus_allowed=1-47
> ;gtod_reduce=1
> ;iodepth_batch=2
> ;iodepth_batch_complete=2
> runtime=60
> ;thread
> loops = 10000

Is there any effect on random read IOPS when you decrease the sdev queue
depth? For sequential IO, IO merging can be enhanced that way.

> > >
> > > fio queue depth is 40, and I'm using 12x SAS SSDs.
> > >
> > > I got a comparable disparity in results for fio queue depth = 128 and
> > > num jobs = 1:
> > >
> > > sdev qdepth      fio number jobs*     1
> > > 16                                    1640
> > > 32                                    1618
> > > 64                                    1577
> > > 254 (default)                         1437
> > >
> > > IO sched = none.
> > >
> > > That driver also sets queue depth tracking = 1, but it never seems to
> > > kick in.
> > >
> > > So it seems to me that the block layer is merging more bios per request,
> > > as the average sg count per request goes up from 1 to 6 or more. As far
> > > as I can see, when the queue depth is lowered, the only thing that really
> > > changes is that we fail more often to get the budget in
> > > scsi_mq_get_budget()->scsi_dev_queue_ready().
> >
> > Right, the behavior basically doesn't change compared with the block
> > legacy io path. And that is why sdev->queue_depth is a bit important
> > for HDD.
>
> OK
>
> > >
> > > So the initial sdev queue depth comes from cmd_per_lun by default, or
> > > from the driver setting it manually via scsi_change_queue_depth(). It
> > > seems to me that some drivers are not setting this optimally, as above.
> > >
> > > Thoughts on guidance for setting sdev queue depth? Could blk-mq change
> > > this behavior?
> >
> > So far, the sdev queue depth is provided by the SCSI layer, and blk-mq
> > can queue one request only if budget is obtained via .get_budget().
> >
>
> Well, based on my testing, the default sdev queue depth seems too large
> for that LLDD ...

Yeah, it is similar with NVMe, since people often care more about latency
for SSDs.

--
Ming
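
For reference, a minimal sketch of the driver-side tuning being discussed,
assuming a hypothetical LLDD called "foo": the initial per-device depth is
capped in ->slave_configure() instead of inheriting a large cmd_per_lun
default. scsi_change_queue_depth() and the scsi_host_template fields are
real SCSI midlayer interfaces; the "foo" names and the depth of 32 are only
illustrative, loosely based on the numbers in the tables above.

	/*
	 * Sketch only, not from any real driver.
	 */
	#include <scsi/scsi_device.h>
	#include <scsi/scsi_host.h>

	#define FOO_SDEV_QUEUE_DEPTH	32	/* assumed per-LUN depth */

	static int foo_slave_configure(struct scsi_device *sdev)
	{
		/*
		 * A lower depth makes scsi_dev_queue_ready() refuse budget
		 * earlier, so the block layer merges more bios per request,
		 * as discussed above.
		 */
		scsi_change_queue_depth(sdev, FOO_SDEV_QUEUE_DEPTH);
		return 0;
	}

	static struct scsi_host_template foo_sht = {
		.name			= "foo",
		.slave_configure	= foo_slave_configure,
		.cmd_per_lun		= FOO_SDEV_QUEUE_DEPTH,
		.can_queue		= 1024,	/* assumed HBA-wide depth */
	};

At runtime the same value can also be changed without rebuilding the driver
by writing to /sys/block/<dev>/device/queue_depth, which makes it easy to
sweep different depths while running fio.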