On Mon, 2018-04-02 at 11:53 +0530, Sreekanth Reddy wrote: > On Fri, Mar 30, 2018 at 5:29 PM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote: > > On Fri, Mar 30, 2018 at 03:07:12PM +0530, Chaitra P B wrote: > > > Check scsi tracker for NULL before accessing it. > > > And in some places there are possibilities for getting valid st > > > but still other fields are not set. > > > > Please explain how that could ever happen. You should never see > > a scsi_cmnd without the device pointer. > > Chris, > > Here is one example, During Host reset operation time driver will > flush out all the outstanding IO's to the SML in below function, > > static void > _scsih_flush_running_cmds(struct MPT3SAS_ADAPTER *ioc) > { > struct scsi_cmnd *scmd; > struct scsiio_tracker *st; > u16 smid; > int count = 0; > > [SR] Here driver is looping starting from smid one to HBA queue depth. > for (smid = 1; smid <= ioc->scsiio_depth; smid++) { > > [SR] Some times it is possible that scsi_cmnd might have created at > SML but it might not be issued to the driver as host reset operation > is going on, So here we will get non-NULL scmd. > scmd = mpt3sas_scsih_scsi_lookup_get(ioc, smid); > if (!scmd) > continue; > count++; > _scsih_set_satl_pending(scmd, false); > [SR] Here we are trying to get the scsi tracker 'st' for the above > scmd (which is not received by the driver) and so fields of this 'st' > are uninitialized. > st = scsi_cmd_priv(scmd); > [SR] And here we are trying to clear the scsi tracker 'st' which is > not yet all initialized by the driver, in other terms we are trying to > flush out the scmd command which is not yet all received by the > driver. > mpt3sas_base_clear_st(ioc, st); > scsi_dma_unmap(scmd); > if (ioc->pci_error_recovery || ioc->remove_host) > scmd->result = DID_NO_CONNECT << 16; > else > scmd->result = DID_RESET << 16; > scmd->scsi_done(scmd); > } > dtmprintk(ioc, pr_info(MPT3SAS_FMT "completing %d cmds\n", > ioc->name, count)); > } Hello Sreekanth, >From mpt3sas_scsih_scsi_lookup_get(): st = scsi_cmd_priv(scmd); if (st->cb_idx == 0xFF) scmd = NULL; >From mpt3sas_base_get_smid(), mpt3sas_base_get_smid_scsiio() and mpt3sas_base_get_smid_hpr(): request->cb_idx = cb_idx; Can _scsih_flush_running_cmds() be executed concurrently with scsih_qcmd()? In other words, can it happen that mpt3sas_scsih_scsi_lookup_get() sees that st->cb_idx == 0xff just before scsih_qcmd() assigns a value to .cb_idx? Can this cause _scsih_flush_running_cmds() to skip commands that it shouldn't skip? Can it happen that _scsih_flush_running_cmds() calls .scsi_done() for a SCSI command just after scsih_qcmd() transferred control for that command to the firmware? Can that cause .scsi_done() to be called twice for the same command? Thanks, Bart.