On 22/02/2021 14:23, Roger Willcocks wrote:
FYI we have exactly this issue on a machine here running CentOS 8.3 (kernel 4.18.0-240.1.1) (so presumably this happens in RHEL 8 too.)
Controller is MSCC / Adaptec 3154-8i16e driving 60 x 12TB HGST drives configured as five x twelve-drive raid-6, software striped using md, and formatted with xfs.
Test software writes to the array using multiple threads in parallel.
The smartpqi driver would report the controller offline within ten minutes or so, with status code 0x6100c.
Changed the driver to set 'nr_hw_queues = 1' and then tested by filling the array with random files (which took a couple of days). That completed fine, so it looks like that one-line change fixes it.
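The change itself is tiny: force a single blk-mq hardware queue when the driver sets up its Scsi_Host. A rough sketch of the idea (the exact spot in the smartpqi host-setup path and the surrounding context are assumptions, not a verified patch):

    /* in the smartpqi host setup path, before scsi_add_host() */
    shost->nr_hw_queues = 1;    /* was: one hw queue per controller queue group */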
That just makes the driver single-queue.
As such, since the driver uses blk_mq_unique_tag_to_hwq(), only hw queue
#0 will ever be used in the driver.
And then, since the driver still spreads MSI-X interrupt vectors over
all CPUs [from pci_alloc_irq_vectors(PCI_IRQ_AFFINITY)], if the CPUs associated
with HW queue #0 are offlined (probably just cpu0), there are no CPUs
available to service queue #0's interrupt. That's what I think would
happen, from a quick glance at the code.
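To spell out the two pieces involved (a simplified sketch, not the literal smartpqi code; local names such as pci_dev and num_queue_groups are placeholders):

    /* submission side: the hw queue index is encoded in the request's
     * unique tag, so with nr_hw_queues = 1 it always evaluates to 0
     */
    u32 unique_tag = blk_mq_unique_tag(scmd->request);
    u16 hw_queue   = blk_mq_unique_tag_to_hwq(unique_tag);

    /* interrupt side: managed MSI-X vectors spread over all CPUs */
    num_vectors = pci_alloc_irq_vectors(pci_dev, 1, num_queue_groups,
                                        PCI_IRQ_MSIX | PCI_IRQ_AFFINITY);

If every CPU in the affinity mask of queue #0's vector goes offline, nothing is left to service its completions.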
It would, of course, be helpful if this was back-ported.
—
Roger
On 3 Feb 2021, at 15:56, Don.Brace@xxxxxxxxxxxxx wrote:
-----Original Message-----
From: Martin Wilck [mailto:mwilck@xxxxxxxx]
Subject: Re: [PATCH] scsi: scsi_host_queue_ready: increase busy count early
That confirmed my suspicions - it looks like the host is sent more commands
than it can handle. We would need many disks to see this issue, though,
which you have.
So for stable kernels, 6eb045e092ef is not in 5.4; the next one is 5.10, and
I suppose it could simply be fixed there by setting .host_tagset in the SCSI
host template.
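Concretely, something along these lines in the driver's template (a sketch; the template symbol name in smartpqi is assumed here, not checked):

    static struct scsi_host_template pqi_driver_template = {
        .module       = THIS_MODULE,
        ...
        .host_tagset  = 1,    /* share one tag set across all hw queues */
    };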
Thanks,
John
--
Don: Even though this works for current kernels, what would the chances be of
getting this back-ported to 5.9 or even further back?
Otherwise the original patch smartpqi_fix_host_qdepth_limit would
correct this issue for older kernels.
True. However, this is 5.12 material, so we shouldn't be bothered by that here. For 5.5 up to 5.9, you need a workaround. But I'm unsure whether smartpqi_fix_host_qdepth_limit would be the solution.
You could simply divide can_queue by nr_hw_queues, as suggested before, or, even simpler, set nr_hw_queues = 1.
How much performance would that cost you?
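The first workaround is also a one-liner at host setup time; a sketch (the name on the right-hand side is a placeholder for the controller's total command slots, not an actual smartpqi field):

    /* give each of the nr_hw_queues hw queues an equal share of the
     * controller's command slots, so the host can't be oversubscribed
     */
    shost->can_queue = total_controller_commands / shost->nr_hw_queues;

(The nr_hw_queues = 1 variant is the same one-line change Roger tested above.)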
Don: For my HBA disk tests...
Dividing can_queue by nr_hw_queues is about a 40% drop:
~380K - 400K IOPS
Setting nr_hw_queues = 1 results in about a 1.5X gain in performance:
~980K IOPS
Setting host_tagset = 1:
~640K IOPS
So, it seems that setting nr_hw_queues = 1 results in the best performance.
Is this expected? Would this also be true for the future?
Thanks,
Don Brace
Below is my setup.
---
[3:0:0:0] disk HP EG0900FBLSK HPD7 /dev/sdd
[3:0:1:0] disk HP EG0900FBLSK HPD7 /dev/sde
[3:0:2:0] disk HP EG0900FBLSK HPD7 /dev/sdf
[3:0:3:0] disk HP EH0300FBQDD HPD5 /dev/sdg
[3:0:4:0] disk HP EG0900FDJYR HPD4 /dev/sdh
[3:0:5:0] disk HP EG0300FCVBF HPD9 /dev/sdi
[3:0:6:0] disk HP EG0900FBLSK HPD7 /dev/sdj
[3:0:7:0] disk HP EG0900FBLSK HPD7 /dev/sdk
[3:0:8:0] disk HP EG0900FBLSK HPD7 /dev/sdl
[3:0:9:0] disk HP MO0200FBRWB HPD9 /dev/sdm
[3:0:10:0] disk HP MM0500FBFVQ HPD8 /dev/sdn
[3:0:11:0] disk ATA MM0500GBKAK HPGC /dev/sdo
[3:0:12:0] disk HP EG0900FBVFQ HPDC /dev/sdp
[3:0:13:0] disk HP VO006400JWZJT HP00 /dev/sdq
[3:0:14:0] disk HP VO015360JWZJN HP00 /dev/sdr
[3:0:15:0] enclosu HP D3700 5.04 -
[3:0:16:0] enclosu HP D3700 5.04 -
[3:0:17:0] enclosu HPE Smart Adapter 3.00 -
[3:1:0:0] disk HPE LOGICAL VOLUME 3.00 /dev/sds
[3:2:0:0] storage HPE P408e-p SR Gen10 3.00 -
-----
[global]
ioengine=libaio
; rw=randwrite
; percentage_random=40
rw=write
size=100g
bs=4k
direct=1
ramp_time=15
; filename=/mnt/fio_test
; cpus_allowed=0-27
iodepth=4096
[/dev/sdd]
[/dev/sde]
[/dev/sdf]
[/dev/sdg]
[/dev/sdh]
[/dev/sdi]
[/dev/sdj]
[/dev/sdk]
[/dev/sdl]
[/dev/sdm]
[/dev/sdn]
[/dev/sdo]
[/dev/sdp]
[/dev/sdq]
[/dev/sdr]
Distribution kernels would be yet another issue; distros can backport host_tagset and get rid of the issue.
Regards
Martin