> -----Original Message-----
> From: Bernd Schubert [mailto:bernd.schubert@xxxxxxxxxxxxxxxxxx]
> Sent: Thursday, August 04, 2011 6:39 PM
> To: Desai, Kashyap
> Cc: linux-scsi@xxxxxxxxxxxxxxx; Nandigama, Nagalakshmi; Prakash, Sathya;
> Moore, Eric; JBottomley@xxxxxxxxxxxxx
> Subject: Re: [PATCH 04/05] mptfusion: Fix for device offline while doing
> aggressive HBA reset
>
> On 08/04/2011 01:13 PM, kashyap.desai@xxxxxxx wrote:
> > Issue:
> > Device goes offline while doing aggressive HBA reset
> > along with IO using some utility.
> >
> > Root cause:
> > FW goes into bad state due to aggressive reset. Softreset does
> > not help to recover FW. And also aggressive reset open up the
> > window for Error handling thread to kicked off at the same time
> > HBA will be in constant RESET loop as part of aggressive reset
> > test case can lead Device to goes offline.
> >
> > Changes:
> > 1. Added extra check as below inside eh_timed_out call back as below.
> >	if(ioc->ioc_reset_in_progress)
> >		Rc = EH_TIMER_RESET
> > 2. Removed "DOORBELL_ACTIVE" check for SAS controller from task management context.
> >    Since SAS controller uses high priority queue for task management. This check is
> >    not required for SAS controller.
> > 3. Moved SoftReset call to HardReset from Task Mgmt context.
>
> [...]
> >
> > --- a/drivers/message/fusion/mptscsih.c
> > +++ b/drivers/message/fusion/mptscsih.c
> > @@ -1630,7 +1630,13 @@ mptscsih_IssueTaskMgmt(MPT_SCSI_HOST *hd, u8 type, u8 channel, u8 id, int lun,
> >  		return 0;
> >  	}
> >
> > -	if (ioc_raw_state & MPI_DOORBELL_ACTIVE) {
> > +	/* DOORBELL ACTIVE check is not required if
> > +	 * MPI_IOCFACTS_CAPABILITY_HIGH_PRI_Q is supported.
> > +	 */
> > +
> > +	if (!((ioc->facts.IOCCapabilities & MPI_IOCFACTS_CAPABILITY_HIGH_PRI_Q)
> > +	     && (ioc->facts.MsgVersion >= MPI_VERSION_01_05)) &&
> > +	    (ioc_raw_state & MPI_DOORBELL_ACTIVE)) {
> >  		printk(MYIOC_s_WARN_FMT
> >  		    "TaskMgmt type=%x: ioc_state: "
> >  		    "DOORBELL_ACTIVE (0x%x)!\n",
> > @@ -1729,7 +1735,7 @@ mptscsih_IssueTaskMgmt(MPT_SCSI_HOST *hd, u8 type, u8 channel, u8 id, int lun,
> >  		printk(MYIOC_s_WARN_FMT
> >  		    "Issuing Reset from %s!! doorbell=0x%08x\n",
> >  		    ioc->name, __func__, mpt_GetIocState(ioc, 0));
> > -		retval = mpt_Soft_Hard_ResetHandler(ioc, CAN_SLEEP);
> > +		retval = mpt_HardResetHandler(ioc, CAN_SLEEP);
> >  		mpt_free_msg_frame(ioc, mf);
> >  	}
>
> Have you ever tested that with dual port 501030C parallel scsi HBAs? The
> hard reset with those HBAs will reset *both* ports and eventually *both*
> ports will fail. A couple of years ago I tried to convince Eric to
> disable hard resets for those chips at all (and even sent a patch), but
> Eric never agreed on that.
> The soft-reset handler was a workaround for that problem, but with that
> patch the issue will re-appear. The affected systems are still in
> production and probably will still be for the next few years.

I have not tried this with a dual-port 501030C parallel SCSI HBA, but I
remember the exact issue you describe here.

I can add a check for ioc->bus_type == SAS so that only SAS controllers
go to HardReset; in all other cases I will continue with SoftReset.

Would that be enough to avoid the issue you mentioned? Please let me
know your view on it, so that I can resend the patch.

~ Kashyap

>
>
> Thanks,
> Bernd
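
For reference, the bus-type check proposed in the reply could be sketched
roughly as below. This is only a standalone illustration, not the resent
patch: the enum values, struct ioc_sketch and pick_taskmgmt_reset() are
made-up stand-ins for the driver's own types, and only the
"bus_type == SAS" condition comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the driver's bus type; names here are hypothetical. */
enum bus_type { BUS_FC, BUS_SPI, BUS_SAS };

/* Which reset path the task management code would take. */
enum reset_kind {
	RESET_SOFT_THEN_HARD,	/* existing behaviour: try soft reset first */
	RESET_HARD		/* proposed behaviour for SAS only */
};

/* Minimal stand-in for the adapter structure (assumption). */
struct ioc_sketch {
	enum bus_type bus_type;
};

/*
 * Hypothetical helper: SAS controllers go straight to hard reset,
 * everything else (e.g. dual-port parallel SCSI HBAs) keeps the
 * soft-reset-first path so one port's recovery does not reset both.
 */
static enum reset_kind pick_taskmgmt_reset(const struct ioc_sketch *ioc)
{
	assert(ioc != NULL);

	if (ioc->bus_type == BUS_SAS)
		return RESET_HARD;

	return RESET_SOFT_THEN_HARD;
}
```

In the driver itself the two outcomes would correspond to calling
mpt_HardResetHandler() or mpt_Soft_Hard_ResetHandler() from
mptscsih_IssueTaskMgmt(), as in the quoted hunk.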