On 10/02/2016 07:51 PM, Dāvis Mosāns wrote:
> I've HighPoint RocketRAID 2760A which uses mvsas driver.
>
> And I need to increase it's timeout because it timeouts too early and
> doesn't allow HDD to finish it's recovery routine for unreadable
> sector (that HDD doesn't support TLER)
>
> I've increased
>
> # echo 300 > /sys/block/sdd/device/timeout
> # echo 300 > /sys/block/sdd/device/eh_timeout
>
> But it didn't gave any effect, it still timeouts in ~8 seconds.
>
>
> # hdparm --read-sector 3021567960 /dev/sdd
> /dev/sdd:
> reading sector 3021567960: FAILED: Input/output error
>
>
> [17226.257531] /mnt/linux/drivers/scsi/mvsas/mv_sas.c 1771:port 2 slot 0 rx_desc 30000 has error info0000000001000000.
> [17226.266698] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
> [17226.266707] sas: ata21: end_device-7:2: cmd error handler
> [17226.266740] sas: ata7: end_device-7:0: dev error handler
> [17226.266750] sas: ata8: end_device-7:1: dev error handler
> [17226.266760] sas: ata21: end_device-7:2: dev error handler
> [17226.266767] sas: ata10: end_device-7:3: dev error handler
> [17226.266772] ata21.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> [17226.266778] ata21.00: failed command: READ SECTOR(S) EXT
> [17226.266781] sas: ata12: end_device-7:5: dev error handler
> [17226.266787] sas: ata11: end_device-7:4: dev error handler
> [17226.266793] sas: ata13: end_device-7:6: dev error handler
> [17226.266795] sas: ata14: end_device-7:7: dev error handler
> [17226.266813] ata21.00: cmd 24/00:01:d8:77:19/00:00:b4:00:00/e0 tag 21 pio 512 in
>          res 51/40:00:d8:77:19/00:00:b4:00:00/00 Emask 0x9 (media error)
> [17226.266820] ata21.00: status: { DRDY ERR }
> [17226.266825] ata21.00: error: { UNC }
> [17226.330498] ata21.00: failed to IDENTIFY (I/O error, err_mask=0x1)
> [17226.330506] ata21.00: revalidation failed (errno=-5)
> [17226.330514] ata21: hard resetting link
> [17226.483739] ata21.00: failed to IDENTIFY (I/O error, err_mask=0x1)
> [17226.483746] ata21.00: revalidation failed (errno=-5)
> [17228.669337] hpet1: lost 331 rtc interrupts
> [17230.689985] hpet1: lost 129 rtc interrupts
> [17231.483422] ata21: hard resetting link
> [17231.637199] ata21.00: failed to IDENTIFY (I/O error, err_mask=0x1)
> [17231.637207] ata21.00: revalidation failed (errno=-5)
> [17231.637212] ata21.00: disabled
> [17231.637252] ata21: EH complete
> [17231.637275] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries:
>
> After this that disk isn't accessible at all until it's physically
> disconnected and reconnected.
>

Well, this looks more like the ATA error recovery not working properly;
libata-eh is trying to reset the link (that's the 'hard resetting link'
message), but after that the device doesn't respond (that's the 'failed
to IDENTIFY' message).

So it's not so much a wrong timeout, it's a wrong EH implementation.
We would need to check why mvsas hard reset is not working; I've seen a
similar issue on isci, but haven't been able to debug things properly.
So it might even be a generic libsas EH issue, and not related to mvsas
at all.

Cheers,

Hannes
--
Dr. Hannes Reinecke                zSeries & Storage
hare@xxxxxxx                       +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
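
For anyone who wants to check their own dmesg output against the reading above (command fails, link is hard reset, IDENTIFY gets no answer, device is disabled), here is a minimal Python sketch. It is not part of any kernel tooling: the function names `summarize_eh` and `elapsed` and the event labels are made up for illustration, and the patterns only cover the message texts quoted in this thread.

```python
import re

# Matches "[seconds.micros] message", the dmesg timestamp format seen in the log.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# Hypothetical event labels, keyed to the message texts quoted in this thread.
EVENTS = [
    ("command_failed", re.compile(r"failed command: ")),
    ("identify_failed", re.compile(r"failed to IDENTIFY")),
    ("hard_reset", re.compile(r"hard resetting link")),
    ("disabled", re.compile(r"ata\d+\.\d+: disabled")),
    ("eh_complete", re.compile(r"EH complete")),
]

def summarize_eh(log_text):
    """Return a list of (timestamp, label) for recognised libata EH messages."""
    events = []
    for raw in log_text.splitlines():
        # Drop mail quote markers ("> ") and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        match = LINE_RE.match(line)
        if not match:
            continue
        timestamp, message = float(match.group(1)), match.group(2)
        for label, pattern in EVENTS:
            if pattern.search(message):
                events.append((timestamp, label))
                break
    return events

def elapsed(events):
    """Seconds between the first and last recognised event (0.0 if none)."""
    if not events:
        return 0.0
    return events[-1][0] - events[0][0]

# A subset of the lines quoted above, with the quote markers left in.
SAMPLE = """\
> [17226.266778] ata21.00: failed command: READ SECTOR(S) EXT
> [17226.330498] ata21.00: failed to IDENTIFY (I/O error, err_mask=0x1)
> [17226.330514] ata21: hard resetting link
> [17226.483739] ata21.00: failed to IDENTIFY (I/O error, err_mask=0x1)
> [17231.483422] ata21: hard resetting link
> [17231.637199] ata21.00: failed to IDENTIFY (I/O error, err_mask=0x1)
> [17231.637212] ata21.00: disabled
> [17231.637252] ata21: EH complete
"""

if __name__ == "__main__":
    found = summarize_eh(SAMPLE)
    for timestamp, label in found:
        print(f"{timestamp:.6f} {label}")
    print(f"elapsed: {elapsed(found):.3f} s")
```

On the quoted excerpt this picks up two hard resets and three IDENTIFY failures within roughly 5.4 seconds, after which the device is disabled. That fits a failing reset path rather than the 300 second timeout being reached.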