RE: [PATCH] mpt2sas: Remove queuecommand wrapper

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Matthew Wilcox [mailto:willy@xxxxxxxxxxxxxxx]
> Sent: Tuesday, July 05, 2011 8:52 PM
> To: Desai, Kashyap
> Cc: Matthew Wilcox; linux-scsi@xxxxxxxxxxxxxxx; DL-MPT Fusion Linux;
> nab@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] mpt2sas: Remove queuecommand wrapper
> 
> On Tue, Jul 05, 2011 at 08:25:43PM +0530, Desai, Kashyap wrote:
> > > > e.a if scsih_qcmd_xxx has executed half of the code and (due to
> preemption is not \
> > > > disable for the same CPU), Scheduler can execute any other process
> on the SAME cpu \
> > > > (Though IRQ is disable). Consider Error handling is kicked off on
> the same CPU and as \
> > > > part of EH, it executed HBA reset. As part of HBA reset Driver has
> return back all \
> > > > pending Scsi command to mid-layer, and once control come back to
> original \
> > > > scsih_qcmd_xxx LLD drive will see bad results (Kernel crash/Data
> corruption/h/w hung \
> > > > or anything critical.....)
> > >
> > > I don't believe this scenario can happen.  Error handling will not
> > > commence until all commands are returned to the midlayer, which
> requires
> > > that queuecommand is no longer running.
> >
> > I was able to see this scenario when I intially started doing code
> review. Since host_lock is removed Error handling can remove code from
> queuecommand anytime ( I asssume preemption is not disable, only IRQ is
> disable).. I am almost sure if we are planning to remove host_lock from
> SML, we at least need irq and preemption disable for mpt2sas and
> mptfusion to work properly.
> 
> I'm pretty sure that can't happen.  Look:
> 
> void scsi_eh_wakeup(struct Scsi_Host *shost)
> {
>         if (shost->host_busy == shost->host_failed) {

I think you are right here. 

I did below experiment.  
Since now host_lock is removed from dispatch() callback.
I added sleep() inside scsih_qcmd().. Just to mimic preemption. And called Host reset using
"Sg_reset -h" at the sametime when driver is sleeping inside qcmd(). I see the kernel crash due to invalid scsi command pointer access after driver comes out from sleep().

 
As you said, due to actual Error handling will not wake up, but if user/customer do testing using sg_tools
They can kick off Host reset on demand.

What is your input on my above test case/observation ?

>                 trace_scsi_eh_wakeup(shost);
>                 wake_up_process(shost->ehandler);
> 
> So the SCSI error handler doesn't run until host_busy == host_failed.
> host_busy is incremented in scsi_request_fn() (before scsi_dispatch_cmd
> is called).  host_failed is incremented in scsi_eh_scmd_add().  So after
> a command has been issued, it prevents the EH from running until it
> too fails.  That means queuecommand cannot be running at the same time
> as the EH.
> 
> That's documented in Documentation/scsi/scsi_eh.txt section [1-3].
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux