From: Martin Wilck <mwilck@xxxxxxxx>

The SCSI mid layer doesn't retry commands after DID_TIME_OUT (see
scsi_noretry_cmd()). Packet loss in the fabric can cause spurious
timeouts during SCSI device probing, causing device probing to fail.
This has been observed in FCoE uplink failover tests, for example.

This patch fixes the issue by retrying the INQUIRY up to 3 times
(in practice, we never observed more than a single retry).

Signed-off-by: Martin Wilck <mwilck@xxxxxxxx>
Tested-by: Dave Prizer <dave.prizer@xxxxxxx>
---
This patch was previously part of the series "Fixes for device probing
on flaky connections", submitted on 2022/06/15. The first patch of the
series has been dropped as discussed in the review process. Testing
verified that this patch alone was sufficient to solve the observed
issues.

---
 drivers/scsi/scsi_scan.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 91ac901a66826..e859a648033f9 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -697,6 +697,11 @@ static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result,
 				    (sshdr.ascq == 0))
 					continue;
 			}
+			if (host_byte(result) == DID_TIME_OUT) {
+				SCSI_LOG_SCAN_BUS(3, sdev_printk(KERN_INFO, sdev,
+						"scsi scan: retry inquiry after timeout\n"));
+				continue;
+			}
 		} else if (result == 0) {
 			/*
 			 * if nothing was transferred, we try
--
2.37.1
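
For readers without the surrounding scsi_scan.c context, the retry behaviour described in the commit message can be sketched as a small userspace program. This is only an illustration, not kernel code: `fake_link`, `fake_inquiry()`, `probe_with_retry()`, the `probe_status` values and `MAX_INQUIRY_ATTEMPTS` are hypothetical names, and the budget of three attempts in total is an assumption taken from the "up to 3 times" wording above.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's host byte codes. */
enum probe_status { PROBE_OK = 0, PROBE_TIME_OUT = 1, PROBE_ERROR = 2 };

/* Assumed budget: three attempts in total ("up to 3 times"). */
#define MAX_INQUIRY_ATTEMPTS 3

/* Hypothetical link that times out for the first `drops` calls. */
struct fake_link {
	int drops;	/* timeouts still to be injected */
	int calls;	/* number of INQUIRY attempts made so far */
};

/* Simulated INQUIRY: times out while drops remain, then succeeds. */
static enum probe_status fake_inquiry(struct fake_link *link)
{
	link->calls++;
	if (link->drops > 0) {
		link->drops--;
		return PROBE_TIME_OUT;
	}
	return PROBE_OK;
}

/*
 * Issue the simulated INQUIRY and loop again on a timeout, until the
 * attempt budget is used up. Any other status ends the loop at once.
 * As in the patch, the message is logged on every timeout, including
 * the last one, after which no further attempt follows.
 */
static enum probe_status probe_with_retry(struct fake_link *link)
{
	enum probe_status st = PROBE_ERROR;
	int attempt;

	for (attempt = 0; attempt < MAX_INQUIRY_ATTEMPTS; attempt++) {
		st = fake_inquiry(link);
		if (st == PROBE_TIME_OUT) {
			fprintf(stderr, "retry inquiry after timeout\n");
			continue;
		}
		break;
	}
	return st;
}
```

With a link that drops a single command, `probe_with_retry()` succeeds on the second attempt, which matches the single retry seen in testing. A link that keeps dropping commands still fails with a timeout once the budget is exhausted.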