Dear Jack,
I set the queue_depth to 1 and the timeout to 300 for all SATA disks
connected to the mvsas controller [ARC-1300ix-16].
Does this mean that ata21 is mapped to /dev/sdq?
root@sweeney:~# dmesg | grep ata21 | grep device
[ 4.788568] sas: ata21: end_device-0:0:26: dev error handler
root@sweeney:~# lsscsi -v | grep end_device-0:0:26
dir: /sys/bus/scsi/devices/0:0:14:0
[/sys/devices/pci0000:00/0000:00:1c.0/0000:08:00.0/host0/port-0:0/expander-0:0/port-0:0:26/end_device-0:0:26/target0:0:14/0:0:14:0]
root@sweeney:~# lsscsi -v | grep 0:0:14:0
[0:0:14:0] disk ATA WDC WD1003FBYX-0 1V01 /dev/sdq
dir: /sys/bus/scsi/devices/0:0:14:0
[/sys/devices/pci0000:00/0000:00:1c.0/0000:08:00.0/host0/port-0:0/expander-0:0/port-0:0:26/end_device-0:0:26/target0:0:14/0:0:14:0]
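For reference, the two lsscsi lookups above can probably be collapsed into one sysfs walk. This is only a sketch: find_disk_by_end_device and the SYSFS_BLOCK override are hypothetical names I made up for illustration, and it assumes readlink -f is available (GNU coreutils). The end_device name still has to come from the dmesg line for the ataN port.

```shell
# Hypothetical helper: print the /dev node(s) whose sysfs device path
# contains the given SAS end device, e.g. "end_device-0:0:26".
# SYSFS_BLOCK is an assumption so the lookup can be pointed at a test tree;
# it defaults to the real /sys/block.
find_disk_by_end_device() {
    end_device=$1
    block_dir=${SYSFS_BLOCK:-/sys/block}
    for link in "$block_dir"/*; do
        [ -e "$link/device" ] || continue
        # Resolve the "device" symlink to its full path under /sys/devices.
        target=$(readlink -f "$link/device")
        case "$target" in
            */"$end_device"/*) echo "/dev/${link##*/}" ;;
        esac
    done
}
```

On this machine, `find_disk_by_end_device end_device-0:0:26` should name the same disk that lsscsi reports for 0:0:14:0.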
I added the following to my rc.local
vim /etc/rc.local
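# Note: the sd{c..r} brace expansion needs bash; dash/POSIX sh does not expand it.
for disk in sd{c..r}; do
    # Use the deadline I/O scheduler and tune its merge and expiry settings.
    echo deadline > "/sys/block/$disk/queue/scheduler"
    echo 0 > "/sys/block/$disk/queue/iosched/front_merges"
    echo 150 > "/sys/block/$disk/queue/iosched/read_expire"
    echo 1500 > "/sys/block/$disk/queue/iosched/write_expire"
    # A queue depth of 1 disables NCQ; the timeout is in seconds.
    echo 1 > "/sys/block/$disk/device/queue_depth"
    echo 300 > "/sys/block/$disk/device/timeout"
done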
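To check that the values actually stick after boot, something like the following could read them back. It is a rough sketch only: show_disk_settings and the SYSFS_BLOCK override are hypothetical names, not an existing tool.

```shell
# Hypothetical check: print scheduler, queue depth and timeout per disk.
# SYSFS_BLOCK is an assumption for testing; it defaults to /sys/block.
show_disk_settings() {
    block_dir=${SYSFS_BLOCK:-/sys/block}
    for disk in "$@"; do
        dev="$block_dir/$disk"
        # Skip names that do not exist on this machine.
        [ -d "$dev" ] || continue
        printf '%s scheduler=%s queue_depth=%s timeout=%s\n' \
            "$disk" \
            "$(cat "$dev/queue/scheduler")" \
            "$(cat "$dev/device/queue_depth")" \
            "$(cat "$dev/device/timeout")"
    done
}
```

For example, `show_disk_settings sdq` prints one line for that disk; the active scheduler is the one shown in square brackets.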
I hope the performance impact of queue_depth = 1 is not too much.
Kind regards,
Jelle de Jong
On 26/01/17 11:17, Jack Wang wrote:
2017-01-26 10:51 GMT+01:00 Jelle de Jong <jelledejong@xxxxxxxxxxxxx>:
Hello everybody,
I have a server that gets seemingly random kernel crashes, most likely due
to an escalation of events from the mvsas-based disk controller.
The hard disks should be okay; I replaced a whole bunch to be sure, but the
server does not become stable. I cannot figure out how to map, for
example, ata21.00 to a disk so I can do a deep badblocks check.
I have saved the complete boot log, including the kernel crash, with
additional information:
http://paste.debian.net/plainh/96325d89
Can somebody take a look and maybe help?
08:00.0 SCSI storage controller: Areca Technology Corp. ARC-1300ix-16
16-Port PCI-Express to SAS Non-RAID Host Adapter (rev 02)
Kind regards,
Jelle de Jong
Your I/O error seems related to NCQ. Have you tried disabling NCQ?
echo 1 > /sys/block/sdX/device/queue_depth
Maybe trying 4.10-rc5 is also an option?
Regards,
Jack
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html