+/*
+ * cmd line parameters
+ */
+static int mpt_mpi_busy;
+module_param(mpt_mpi_busy, int, 0);
+MODULE_PARM_DESC(mpt_mpi_busy, " MPT MPI busy workaround for VMWare ESX (default=0)");
+
 /*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=*/
 
 typedef struct _BIG_SENSE_BUF {
@@ -704,10 +711,13 @@
 			sc->resid=0;
 		case MPI_IOCSTATUS_SCSI_RECOVERED_ERROR:	/* 0x0040 */
 		case MPI_IOCSTATUS_SUCCESS:			/* 0x0000 */
-			if (scsi_status == MPI_SCSI_STATUS_BUSY)
+			if ((scsi_status == MPI_SCSI_STATUS_BUSY) && !mpt_mpi_busy)
 				sc->result = (DID_BUS_BUSY << 16) | scsi_status;
-			else
+			else {
+				if (mpt_mpi_busy)
+					printk(KERN_INFO "MPT MPI ESX busy hack enabled ... waiting\n");
 				sc->result = (DID_OK << 16) | scsi_status;
+			}
 			if (scsi_state == 0) {
 				;
 			} else if (scsi_state & MPI_SCSI_STATE_AUTOSENSE_VALID) {
The printk(KERN_INFO ...) could be emitted only when mpt_mpi_busy == 2, so that logging of the situation becomes an optional debugging aid.
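To illustrate the idea, here is a minimal userspace sketch of how a three-level parameter might behave. It is not part of the patch: the helper mpt_busy_result() is hypothetical, fprintf() stands in for printk(), and the meaning of the value 2 (workaround plus logging) is only the suggestion made above. The sketch also limits the message to BUSY completions, which is my assumption about the intent. The DID_* and MPI_SCSI_STATUS_BUSY values are copied from the kernel headers.

```c
#include <stdio.h>

/* Values as defined in the Linux SCSI and MPI headers. */
#define DID_OK               0x00
#define DID_BUS_BUSY         0x02
#define MPI_SCSI_STATUS_BUSY 0x08

/*
 * Stand-in for the module parameter:
 * 0 = off, 1 = workaround, 2 = workaround plus logging (assumed meaning).
 */
static int mpt_mpi_busy;

/*
 * Hypothetical helper: computes the value assigned to sc->result in the
 * MPI_IOCSTATUS_SUCCESS / RECOVERED_ERROR case of the patch.
 */
static unsigned int mpt_busy_result(unsigned int scsi_status)
{
	/* Workaround disabled: report BUSY as DID_BUS_BUSY, as the driver does today. */
	if (scsi_status == MPI_SCSI_STATUS_BUSY && !mpt_mpi_busy)
		return (DID_BUS_BUSY << 16) | scsi_status;

	/* Log only for BUSY completions, and only at level 2. */
	if (scsi_status == MPI_SCSI_STATUS_BUSY && mpt_mpi_busy == 2)
		fprintf(stderr, "MPT MPI ESX busy hack enabled ... waiting\n");

	/* Workaround enabled, or status is not BUSY: pass the status through with DID_OK. */
	return (DID_OK << 16) | scsi_status;
}
```

With mpt_mpi_busy set to 1 the driver would stay silent, and with 2 it would log each BUSY completion that takes the workaround path.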
Manon

--On 9 January 2007 10:17:17 -0600 Michael Reed <mdr@xxxxxxx> wrote:
Moore, Eric wrote:

On Monday, January 08, 2007 3:25 PM, James Bottomley wrote:

Right, I sort of suspected something like this. BUSY/QUEUE_FULL handling was a bit iffy in 2.4; but it was sorted out in the 2003/4 timeframe. Nowadays, I think you want to translate the MPI_SCSI_STATUS_BUSY directly to SAM_STAT_BUSY (i.e. just remove the special casing if).

Christoph put in code to limit a command's lifetime to prevent infinite loops in the case of QUEUE_FULL and BUSY. (See scsi_softirq_done() for implementation.) DID_OK / COMMAND_COMPLETE / BUSY results in an ADD_TO_MLQUEUE for a retry, same as QUEUE_FULL. I don't see infinite retries, just a whole lot of them. See scsi_decide_disposition().

Mike

I think you're on the same page with the folks from VMware, where they've asked us to go back to our old driver code. Meaning we kill the check for "MPI_SCSI_STATUS_BUSY"; instead the SAM status is sent back "as is" without changing the DID_OK to DID_BUS_BUSY, etc. My problem with that is whether it breaks the Fibre Channel folks. Will the FC failover solution work properly if we go back to the old code? I've added Stephen Shirron and Mike Reed. I don't know. Here is an explanation of why that fix was needed about a year ago:

"When a target device responds with BUSY status, the MPT driver was sending DID_OK to the SCSI mid layer, which caused the IO to be retried indefinitely between the mid layer and the driver. By changing the driver return status to DID_BUS_BUSY, the target BUSY status can now flow through the mid layer to an upper layer Failover driver, which will manage the I/O timeout."
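The two behaviours discussed in the thread can be sketched as a simplified userspace model. This is not kernel code: decide() and cap_lifetime() are hypothetical names, the enum values are illustrative, and the real logic in scsi_decide_disposition() and scsi_softirq_done() handles many more cases. The model only assumes what the thread states: DID_OK with BUSY or QUEUE_FULL is requeued, DID_BUS_BUSY is retried a bounded number of times and then completed upward, and a command's total lifetime is capped.

```c
/* Host byte and SAM status values as defined in the Linux SCSI headers. */
#define DID_OK                 0x00
#define DID_BUS_BUSY           0x02
#define SAM_STAT_BUSY          0x08
#define SAM_STAT_TASK_SET_FULL 0x28	/* QUEUE_FULL */

/* Illustrative values; the kernel uses different constants. */
enum disposition { SUCCESS, NEEDS_RETRY, ADD_TO_MLQUEUE };

/* Simplified model of the decision made for a completed command. */
static enum disposition decide(unsigned int result, int retries, int allowed)
{
	unsigned int host = (result >> 16) & 0xff;
	unsigned int status = result & 0xff;

	/* DID_BUS_BUSY: bounded retries, then the result is passed upward. */
	if (host == DID_BUS_BUSY)
		return retries < allowed ? NEEDS_RETRY : SUCCESS;

	/* DID_OK with BUSY or QUEUE_FULL: requeue through the mid layer. */
	if (status == SAM_STAT_BUSY || status == SAM_STAT_TASK_SET_FULL)
		return ADD_TO_MLQUEUE;

	return SUCCESS;
}

/*
 * Simplified model of the lifetime limit: once a command has been alive
 * longer than (allowed + 1) * timeout, stop retrying and complete it.
 */
static enum disposition cap_lifetime(enum disposition d, unsigned long age,
				     unsigned long timeout, int allowed)
{
	unsigned long wait_for = (unsigned long)(allowed + 1) * timeout;

	if (d != SUCCESS && age > wait_for)
		return SUCCESS;
	return d;
}
```

Under these assumptions a BUSY target behind DID_OK is requeued until the lifetime cap expires, while DID_BUS_BUSY gives up after the allowed retries and lets an upper-layer failover driver see the status sooner.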
Manon Goo
Dembach Goo Informatik GmbH & Co KG
Rathenauplatz 9
D-50674 Köln
Tel: +49 221 801483 0
Mobil: +49 177 8091974
Fax: +49 221 801483 20
Email: manon@xxxxxxxx