Adding VMware engineering...

-----Original Message-----
From: Manon Goo [mailto:manon@xxxxxxxx]
Sent: Tuesday, January 09, 2007 9:49 AM
To: Michael Reed; Moore, Eric; David Berghoff
Cc: James Bottomley; Adam Zimman; linux-scsi@xxxxxxxxxxxxxxx; Shirron, Stephen
Subject: Re: [PATCH 2/5] fusion: vmware bug fix prevent inifinite retries

Hmm .... why don't we make the whole thing configurable
(David implemented this for us):

+/*
+ * cmd line parameters
+ */
+static int mpt_mpi_busy;
+module_param(mpt_mpi_busy, int, 0);
+MODULE_PARM_DESC(mpt_mpi_busy, " MPT MPI busy workaround for VMWare ESX (default=0)");
+
 /*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= */

 typedef struct _BIG_SENSE_BUF {
@@ -704,10 +711,13 @@
 			sc->resid=0;
 		case MPI_IOCSTATUS_SCSI_RECOVERED_ERROR:	/* 0x0040 */
 		case MPI_IOCSTATUS_SUCCESS:			/* 0x0000 */
-			if (scsi_status == MPI_SCSI_STATUS_BUSY)
+			if ((scsi_status == MPI_SCSI_STATUS_BUSY) && !mpt_mpi_busy)
 				sc->result = (DID_BUS_BUSY << 16) | scsi_status;
-			else
+			else {
+				if (mpt_mpi_busy)
+					printk(KERN_INFO "MPT MPI ESX busy hack enabled ... waiting\n");
 				sc->result = (DID_OK << 16) | scsi_status;
+			}
 			if (scsi_state == 0) {
 				;
 			} else if (scsi_state & MPI_SCSI_STATE_AUTOSENSE_VALID) {

The printk(KERN_INFO ...) could be made conditional on mpt_mpi_busy == 2,
so that debugging output for this situation is optional.

Manon

--On 9 January 2007 10:17:17 -0600 Michael Reed <mdr@xxxxxxx> wrote:

>
>
> Moore, Eric wrote:
>> On Monday, January 08, 2007 3:25 PM, James Bottomley wrote:
>>
>>> Right, I sort of suspected something like this.  BUSY/QUEUE_FULL
>>> handling was a bit iffy in 2.4; but it was sorted out in the 2003/4
>>> timeframe.  Nowadays, I think you want to translate the
>>> MPI_SCSI_STATUS_BUSY directly to SAM_STAT_BUSY (i.e. just remove the
>>> special casing if).
>
> Christoph put in code to limit a command's lifetime to prevent infinite
> loops in the case of QUEUE_FULL and BUSY.  (See scsi_softirq_done()
> for implementation.)
>
> DID_OK / COMMAND_COMPLETE / BUSY results in an ADD_TO_MLQUEUE for a retry,
> same as QUEUE_FULL.  I don't see infinite retries, just a whole lot of them.
> See scsi_decide_disposition().
>
> Mike
>
>>>
>>
>> I think you're on the same page with the folks from VMware,
>> where they've asked us to go back to our old driver code.
>> Meaning we kill the check for "MPI_SCSI_STATUS_BUSY"; instead the SAM
>> status is sent back "as is", without changing the DID_OK to
>> DID_BUS_BUSY, etc.
>>
>> My problem with that is whether it breaks the Fibre Channel folks.
>> Will the FC failover solution work properly if we go back to the old code?
>> I added Stephen Shirron and Mike Reed.
>> I don't know.  Here is an explanation of why that fix was needed
>> about a year ago:
>>
>>
>> "When a target device responds with BUSY status, the MPT driver was
>> sending DID_OK to the SCSI mid layer, which caused the IO to be retried
>> indefinitely between the mid layer and the driver. By changing the
>> driver return status to DID_BUS_BUSY, the target BUSY status can now
>> flow through the mid layer to an upper layer Failover driver, which
>> will manage the I/O timeout."
>>
>> -
>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>

Manon Goo
Dembach Goo Informatik GmbH & Co KG
Rathenauplatz 9
D-50674 Köln
Tel: +49 221 801483 0
Mobil: +49 177 8091974
Fax: +49 221 801483 20
Email: manon@xxxxxxxx
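
For reference, the result-word mapping discussed in this thread can be sketched in user space. This is only an illustration of the branch in the patch above, not driver code: mpt_busy_result() and make_result() are hypothetical helpers, and the three constants are local stand-ins assumed to match the values in the kernel's SCSI and MPI headers.

```c
#include <assert.h>

/* Local stand-ins (assumption: same values as the kernel headers). */
#define DID_OK               0x00
#define DID_BUS_BUSY         0x02
#define MPI_SCSI_STATUS_BUSY 0x08

/* Pack a host byte and a SCSI status byte the way sc->result is built
 * in the patch: host byte in bits 16..23, status in bits 0..7. */
unsigned int make_result(unsigned int host_byte, unsigned int scsi_status)
{
	assert(host_byte <= 0xff && scsi_status <= 0xff);
	return (host_byte << 16) | scsi_status;
}

/* Hypothetical helper mirroring the patched if/else:
 * - workaround off (0): target BUSY is reported as DID_BUS_BUSY
 * - workaround on (non-zero): the status is passed through with DID_OK */
unsigned int mpt_busy_result(unsigned int scsi_status, int mpt_mpi_busy)
{
	if (scsi_status == MPI_SCSI_STATUS_BUSY && !mpt_mpi_busy)
		return make_result(DID_BUS_BUSY, scsi_status);
	return make_result(DID_OK, scsi_status);
}
```

With these stand-in values, mpt_busy_result(MPI_SCSI_STATUS_BUSY, 0) gives 0x00020008 (the current DID_BUS_BUSY behaviour), while mpt_busy_result(MPI_SCSI_STATUS_BUSY, 1) gives 0x00000008, i.e. DID_OK with the BUSY status passed back "as is", which is the case Mike describes as being requeued by the mid layer.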