On Saturday, January 06, 2007 8:31 AM, James Bottomley wrote:
>
> DID_BUS_BUSY causes an immediate retry, but it does debit the retry
> count, so it shouldn't cause "infinite retries" ... if it does, there's
> something else wrong here.
>
> I should also point out that the MPI_SCSI_STATUS_BUSY is
> SAM_STAT_BUSY ... this return will cause a queue stop and a requeue, but
> it doesn't actually debit the retries, so it *may* cause an infinite
> loop if the system is permanently busy.
>
> Finally, whatever's causing this, it should probably be treated the same
> for all fusion bus types ...
>

James - I was incorrect in the way I worded this patch. Please read further.

The original request came to me and you from Manon Goo <manon@xxxxxxxx> on November 21; see attached.

Here is what VMware says, per Adam Zimman <azimman@xxxxxxxxxx>:

"VMkernel emulates a 1030/SPI. Path Failovers induce the vmkernel to return a BUSY status for VM initiated SCSI I/O requests. After a few I/O commands are returned with BUSY status, the RHEL VM will make the disk read-only. The host status of DID_BUS_BUSY causes the RHEL scsi error recovery process to retry a BUSY I/O at most 5 times and then return an I/O failure upward in the I/O stack. If the I/O request failed with a scsi status of BUSY rather than a host status of DID_BUS_BUSY, the RHEL scsi error recovery process would retry the I/O indefinitely."

In version 03.02.19, we added the current logic for the following reason:

"When a target device responds with BUSY status, the MPT driver was sending DID_OK to the SCSI mid layer, which caused the IO to be retried indefinitely between the mid layer and the driver. By changing the driver return status to DID_BUS_BUSY, the target BUSY status can now flow through the mid layer to an upper layer Failover driver, which will manage the I/O timeout."

Eric
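To make the two retry behaviours discussed above concrete, here is a minimal sketch in plain C. It is a toy model and not the mid-layer code: the names simulate(), struct sim_cmd, the two policy enumerators, and the max_attempts cap are all hypothetical and exist only for illustration. One policy debits a retry counter on every BUSY (as described for DID_BUS_BUSY); the other requeues without debiting (as described for a SCSI status of BUSY).

```c
#include <stddef.h>

/* Hypothetical policies modelling the two cases described in the thread. */
enum busy_policy {
    BUSY_DEBITS_RETRY,   /* BUSY counts against the retry budget */
    BUSY_REQUEUES_FREE   /* BUSY is requeued without touching the budget */
};

enum outcome { IO_OK, IO_FAILED, IO_STILL_RETRYING };

/* Hypothetical stand-in for the per-command retry accounting. */
struct sim_cmd {
    int retries;   /* retries consumed so far */
    int allowed;   /* retry budget */
};

/*
 * Simulate one I/O against a device that reports BUSY for the first
 * busy_for attempts. max_attempts only bounds the simulation so that
 * the "free requeue" case terminates; it has no kernel counterpart.
 */
static enum outcome simulate(enum busy_policy policy, int busy_for,
                             int allowed, int max_attempts,
                             int *attempts_out)
{
    struct sim_cmd cmd = { .retries = 0, .allowed = allowed };
    enum outcome result = IO_STILL_RETRYING;
    int attempts = 0;

    while (attempts < max_attempts) {
        attempts++;
        if (attempts > busy_for) {          /* device no longer busy */
            result = IO_OK;
            break;
        }
        if (policy == BUSY_DEBITS_RETRY && ++cmd.retries > cmd.allowed) {
            result = IO_FAILED;             /* budget exhausted */
            break;
        }
        /* otherwise: issue the command again */
    }

    if (attempts_out != NULL)
        *attempts_out = attempts;
    return result;
}
```

With a budget of 5 and a device that stays busy longer than that, the debiting policy gives up and reports a failure, while the free-requeue policy keeps going until the device recovers or, in this model, until the artificial cap is reached.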
--- Begin Message ---
- To: "Moore, Eric" <Eric.Moore@xxxxxxx>, <James.Bottomley@xxxxxxxxxxxx>
- Subject: concerning mptscsih.c
- From: "Manon Goo" <manon@xxxxxxxx>
- Date: Tue, 21 Nov 2006 18:39:26 -0700
- Cc: "Andreas Dembach" <ad@xxxxxxxx>, "David Berghoff" <david@xxxxxxxx>
- Reply-to: "Manon Goo" <manon@xxxxxxxx>
- Thread-index: AccN1xAfVOJ7JmFES22NGAYhC50wXg==
- Thread-topic: concerning mptscsih.c
Dear Sirs,

When changing from kernel 2.6.13 to 2.6.14, a change to the mptscsih.c driver was introduced that changed the behaviour of the driver with respect to timeouts. The change was introduced around line 760. As far as I understand, this change propagates a busy device running async as a host failure. This is extremely troublesome when using the mptscsi driver with VMware ESX, because ESX expects the driver to wait when doing SAN path failovers or going async.

Is there any chance to have the old behaviour back?

Thanks in advance
Manon Goo

                 break;

 +       case MPI_IOCSTATUS_SCSI_DATA_OVERRUN:           /* 0x0044 */
 +               sc->resid=0;
         case MPI_IOCSTATUS_SCSI_RECOVERED_ERROR:        /* 0x0040 */
         case MPI_IOCSTATUS_SUCCESS:                     /* 0x0000 */
 -               scsi_status = pScsiReply->SCSIStatus;
 -               sc->result = (DID_OK << 16) | scsi_status;
 +               if (scsi_status == MPI_SCSI_STATUS_BUSY)
 +                       sc->result = (DID_BUS_BUSY << 16) | scsi_status;
 +               else
 +                       sc->result = (DID_OK << 16) | scsi_status;
                 if (scsi_state == 0) {
                         ;
                 } else if (scsi_state & MPI_SCSI_STATE_AUTOSENSE_VALID) {

Manon Goo
Dembach Goo Informatik GmbH & Co KG
Rathenauplatz 9
D-50674 Köln
Tel: +49 221 801483 0
Mobil: +49 177 8091974
Fax: +49 221 801483 20
Email: manon@xxxxxxxx

Attachment: pgpCzBTAlg0ad.pgp
Description: PGP signature
--- End Message ---
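
For readers following the diff in the attached message, the sketch below shows how the result word is packed: the host byte sits in bits 16-23 and the SCSI status in the low byte. The three constants carry the values used in the Linux SCSI headers and are redefined here only so the fragment compiles on its own. The helper names (make_result, result_host, result_status, result_before_patch, result_after_patch) are hypothetical and do not appear in the driver.

```c
#include <stdint.h>

/* Values as in the Linux SCSI headers, repeated so this compiles standalone. */
#define DID_OK          0x00
#define DID_BUS_BUSY    0x02
#define SAM_STAT_BUSY   0x08    /* same value as MPI_SCSI_STATUS_BUSY */

/* Hypothetical helper: pack host byte and SCSI status as the diff does. */
static uint32_t make_result(uint32_t host, uint32_t status)
{
    return (host << 16) | status;
}

/* Hypothetical helpers: unpack the two fields again. */
static uint32_t result_host(uint32_t result)
{
    return (result >> 16) & 0xff;
}

static uint32_t result_status(uint32_t result)
{
    return result & 0xff;
}

/* Behaviour of the removed lines: always report DID_OK. */
static uint32_t result_before_patch(uint32_t scsi_status)
{
    return make_result(DID_OK, scsi_status);
}

/* Behaviour of the added lines: a BUSY status is reported as DID_BUS_BUSY. */
static uint32_t result_after_patch(uint32_t scsi_status)
{
    if (scsi_status == SAM_STAT_BUSY)
        return make_result(DID_BUS_BUSY, scsi_status);
    return make_result(DID_OK, scsi_status);
}
```

In both variants the status byte still carries BUSY; only the host byte differs, and that difference is what this thread is debating.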