Hi,

I have a problem with the drives in my RAID array. Some of the drives get disconnected whenever I try to write a significant amount of data to the array.

The problematic array is a RAID6 consisting of 7 drives:

* 2 x WDC WD15EADS-22P8B0
* 5 x SAMSUNG HD154UI

The drives are connected through an HP SAS expander to an LSI SAS 9201-16i. The RAID device is encrypted using cryptsetup (cryptsetup --cipher aes-xts-plain64 --key-size 256 --key-file ./keyN.bin open --type plain /dev/mdN cN) and the filesystem on top of it is ext4. I am running Debian jessie with the Linux kernel from backports (4.9.0-1-amd64 #1 SMP Debian 4.9.6-3 (2017-01-28) x86_64 GNU/Linux).

The filesystem on the array was mostly full (500 GB free out of 7500 GB) and the data was archived data for which I have a backup. The array worked fine for reading, but whenever I tried to write any significant amount of data to the filesystem (more than roughly 20-100 GB), drives got dropped from the array.

I thought that one of the disks might be failing, so I ran badblocks (badblocks -wsv /dev/sdX) on each of the drives (simultaneously on all of them) and none of the drives reported any error. I checked the S.M.A.R.T. reports and they don't look bad either.

I recorded the kernel messages during one of the incidents:

[22148.047650] mpt2sas_cm0: log_info(0x31120b10): originator(PL), code(0x12), sub_code(0x0b10)
(Previous message repeated multiple times...)
[22159.797403] mpt2sas_cm0: log_info(0x31120b10): originator(PL), code(0x12), sub_code(0x0b10)
[22175.047239] mpt2sas_cm0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
(Previous message repeated multiple times...)
[22273.046355] mpt2sas_cm0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
[22273.046437] sd 0:0:6:0: [sdg] tag#2 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22273.046441] sd 0:0:6:0: [sdg] tag#2 CDB: Write(10) 2a 00 86 93 fa e8 00 04 00 00
[22273.046448] blk_update_request: I/O error, dev sdg, sector 2257844968
[22273.046713] sd 0:0:6:0: [sdg] tag#1 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22273.046715] sd 0:0:6:0: [sdg] tag#1 CDB: Write(10) 2a 00 86 93 f6 e8 00 04 00 00
[22273.046717] blk_update_request: I/O error, dev sdg, sector 2257843944
[22273.047830] sd 0:0:6:0: [sdg] tag#19 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22273.047832] sd 0:0:6:0: [sdg] tag#19 CDB: Write(10) 2a 00 86 93 fe e8 00 04 00 00
[22273.047833] blk_update_request: I/O error, dev sdg, sector 2257845992
[22297.545811] mpt2sas_cm0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
(Previous message repeated multiple times...)
[22297.545874] mpt2sas_cm0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
[22297.545883] sd 0:0:6:0: [sdg] tag#24 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22297.545890] mpt2sas_cm0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
[22297.545893] sd 0:0:6:0: [sdg] tag#24 CDB: Write(10) 2a 00 86 94 12 e8 00 04 00 00
[22297.545896] mpt2sas_cm0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
[22297.545898] blk_update_request: I/O error, dev sdg, sector 2257851112
[22297.545905] mpt2sas_cm0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
(Previous message repeated multiple times...)
[22297.546029] mpt2sas_cm0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
[22297.546062] sd 0:0:6:0: [sdg] tag#16 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22297.546066] sd 0:0:6:0: [sdg] tag#16 CDB: Write(10) 2a 00 86 94 02 e8 00 04 00 00
[22297.546069] blk_update_request: I/O error, dev sdg, sector 2257847016
[22297.546070] sd 0:0:6:0: [sdg] tag#23 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22297.546073] sd 0:0:6:0: [sdg] tag#23 CDB: Write(10) 2a 00 86 94 0e e8 00 04 00 00
[22297.546074] blk_update_request: I/O error, dev sdg, sector 2257850088
[22297.546160] sd 0:0:6:0: [sdg] tag#22 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22297.546162] sd 0:0:6:0: [sdg] tag#22 CDB: Write(10) 2a 00 86 94 0a e8 00 04 00 00
[22297.546163] blk_update_request: I/O error, dev sdg, sector 2257849064
[22297.546272] sd 0:0:6:0: [sdg] tag#19 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22297.546274] sd 0:0:6:0: [sdg] tag#19 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[22297.546277] sd 0:0:6:0: [sdg] tag#21 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22297.546280] blk_update_request: I/O error, dev sdg, sector 2064
[22297.546281] sd 0:0:6:0: [sdg] tag#21 CDB: Write(10) 2a 00 86 94 06 e8 00 04 00 00
[22297.546283] blk_update_request: I/O error, dev sdg, sector 2257848040
[22297.546326] md: super_written gets error=-5
[22297.546330] md/raid:md1: Disk failure on sdg1, disabling device.
md/raid:md1: Operation continuing on 6 devices.
[22297.546416] sd 0:0:6:0: [sdg] tag#20 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22297.546418] sd 0:0:6:0: [sdg] tag#20 CDB: Write(10) 2a 00 86 94 6a e8 00 01 18 00
[22297.546419] blk_update_request: I/O error, dev sdg, sector 2257873640
[22297.546493] sd 0:0:6:0: [sdg] tag#18 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22297.546494] sd 0:0:6:0: [sdg] tag#18 CDB: Write(10) 2a 00 86 94 66 e8 00 04 00 00
[22297.546495] blk_update_request: I/O error, dev sdg, sector 2257872616
[22297.546609] sd 0:0:6:0: [sdg] tag#17 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22297.546610] sd 0:0:6:0: [sdg] tag#17 CDB: Write(10) 2a 00 86 94 62 e8 00 04 00 00
[22297.546611] blk_update_request: I/O error, dev sdg, sector 2257871592
[22297.546715] sd 0:0:6:0: [sdg] tag#15 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[22297.546716] sd 0:0:6:0: [sdg] tag#15 CDB: Write(10) 2a 00 86 94 5e e8 00 04 00 00
[22297.546717] blk_update_request: I/O error, dev sdg, sector 2257870568
[22322.045468] mpt2sas_cm0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)

It looks to me like the problem is somewhere in the HBA driver, but I guess it might just as well be a problem with md or dm-crypt.

Could anybody advise on how to solve this problem? Is it possible to enable more verbose debug output from the kernel and drivers?

Thanks,
Victor
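
For anyone reading the log above, the following is a minimal Python sketch for cross-checking two kinds of lines in it: the mpt2sas log_info words and the Write(10) CDBs. The log_info bit layout used here is an assumption, inferred from how the driver itself printed the values (0x31120b10 shown as originator PL, code 0x12, sub_code 0x0b10); the helper names decode_log_info and parse_write10 are hypothetical and not part of any driver or tool. The Write(10) layout (opcode 0x2a, 4-byte big-endian LBA at bytes 2-5, 2-byte transfer length at bytes 7-8) is the standard SCSI one.

```python
def decode_log_info(value):
    """Split a 32-bit log_info word into fields.

    Assumed layout (inferred from the driver's own output in the log):
    bits 31-28 bus type, 27-24 originator, 23-16 code, 15-0 sub_code.
    """
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": (value >> 24) & 0xF,
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


def parse_write10(cdb_hex):
    """Return (lba, blocks) from a Write(10) CDB given as spaced hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # spaces between byte pairs are ignored
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte Write(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # transfer length in blocks
    return lba, blocks


if __name__ == "__main__":
    # Values copied from the kernel log in this message.
    print(decode_log_info(0x31120B10))
    print(decode_log_info(0x31111000))
    print(parse_write10("2a 00 86 93 fa e8 00 04 00 00"))
```

Run on the first failed write, the CDB decodes to LBA 2257844968, the same sector that blk_update_request reports, with a transfer length of 1024 blocks. That at least confirms the I/O errors refer to the ordinary data writes being issued at the time, not to some unrelated command.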