RE: [PATCH] mpt3sas: Fix calltrace observed while running IO & host reset

Chaitra Basappa <chaitra.basappa@xxxxxxxxxxxx> · Thu, 14 Jun 2018 15:56:43 +0530

Bart,
 Please see my replies inline.

Thanks,
 Chaitra

-----Original Message-----
From: Bart Van Assche [mailto:Bart.VanAssche@xxxxxxx]
Sent: Wednesday, June 13, 2018 9:22 PM
To: chaitra.basappa@xxxxxxxxxxxx; linux-scsi@xxxxxxxxxxxxxxx
Cc: sathya.prakash@xxxxxxxxxxxx; suganath-prabu.subramani@xxxxxxxxxxxx;
sreekanth.reddy@xxxxxxxxxxxx
Subject: Re: [PATCH] mpt3sas: Fix calltrace observed while running IO & host
reset

On Wed, 2018-06-13 at 15:46 +0530, Chaitra Basappa wrote:
>  When host reset is issued from application, through ioctl reset handler
> _ctl_do_reset() -> mpt3sas_base_hard_reset_handler() sets
> “ioc->shost_recovery” flag.
> If “ioc->shost_recovery” flag is set then driver will return all the
> incoming SCSI cmds with “SCSI_MLQUEUE_HOST_BUSY” in the scsih_qcmd(). And
> hence no new request gets processed by the driver until the reset
> completes,
> which guarantees that the smid won't change.

Hello Chaitra,

The patch at the start of this e-mail thread checks whether st->smid is zero.
That check could only be useful if there would be code in the mpt3sas driver
that clears that field upon command completion. However, I haven't found any
such code in the mpt3sas driver.

[Chaitra]
Before starting the host reset operation, the driver sets the
"ioc->shost_recovery" flag to one. If the driver receives any IO commands
during the host reset, the check below in scsih_qcmd() returns them with
host busy status, so they are never issued to the HBA FW. These SCSI
commands are therefore not outstanding at the driver level, their smid is
zero, and there is no need to flush them out during the host reset.

        /* host recovery or link resets sent via IOCTLs */
        if (ioc->shost_recovery || ioc->ioc_link_reset_in_progress)
                return SCSI_MLQUEUE_HOST_BUSY;
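
A minimal user-space sketch of this gate, not the driver code itself. The
names ioc_model, cmd_model, queue_result and queue_cmd_model() are
hypothetical stand-ins; only the two flags and the "smid stays zero when the
command is rejected" behaviour come from the explanation above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the mid-layer return codes. */
enum queue_result { QUEUE_OK = 0, QUEUE_HOST_BUSY = 1 };

/* Only the two flags tested by the gate are modeled. */
struct ioc_model {
	bool shost_recovery;
	bool ioc_link_reset_in_progress;
};

/* smid == 0 means the command was never issued to the HBA firmware. */
struct cmd_model {
	uint16_t tag;
	uint16_t smid;
};

/* Models the early return: a rejected command keeps smid == 0. */
static enum queue_result queue_cmd_model(const struct ioc_model *ioc,
					 struct cmd_model *cmd)
{
	/* host recovery or link resets sent via IOCTLs */
	if (ioc->shost_recovery || ioc->ioc_link_reset_in_progress)
		return QUEUE_HOST_BUSY;

	/* Past the gate: the command gets a non-zero smid (tag + 1). */
	cmd->smid = (uint16_t)(cmd->tag + 1);
	return QUEUE_OK;
}
```

In this model a command rejected while shost_recovery is set never gets a
non-zero smid, which is the property the smid == 0 check relies on.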

As part of the host reset operation, the driver flushes out all the scsi
commands which are outstanding at the driver level with a "DID_RESET" result.

To determine whether a scsi cmnd is outstanding at the driver level while
looping over 'tag' values from zero to the hba queue depth, the driver
checks the two fields below from the scsiio_tracker:

1. cb_idx == 0xFF : this means that the scsi cmnd has been completed by the
driver, so it is not outstanding at the driver level. This check by itself
is enough to determine that the scsi cmnd has been completed by the driver,
so there is no need to reset smid to zero.
Anyway, it is better to also reset the smid field to zero when cb_idx is
set to 0xFF. Hence we will re-post this patch so that it sets the smid field
in the scsiio_tracker to zero upon completion of the scsi cmnd by the
driver.

2. smid == 0 (zero): this means that the scsi cmnd has not been issued to
the HBA firmware, so it is not outstanding at the driver level. (The current
driver does not check this case, which is why we are observing this issue.
This patch adds the check to fix it.)

If cb_idx != 0xFF && smid != 0, the scsi cmnd is outstanding at the driver
level, and the driver will flush it with "DID_RESET" during the diag reset.
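
The two checks above can be sketched as a small user-space model. This is
only an illustration, not the driver's flush routine: tracker_model,
tracker_is_outstanding() and count_flush_candidates() are hypothetical
names, and only the smid and cb_idx fields of the scsiio_tracker are
modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for scsiio_tracker (two fields only). */
struct tracker_model {
	uint16_t smid;   /* 0: never issued to the HBA firmware */
	uint8_t cb_idx;  /* 0xFF: already completed by the driver */
};

/* A command is outstanding only if both checks pass. */
static bool tracker_is_outstanding(const struct tracker_model *st)
{
	return st->cb_idx != 0xFF && st->smid != 0;
}

/*
 * Walks tags 0 .. depth - 1 and counts the commands that a host reset
 * would have to flush with DID_RESET in this model.
 */
static unsigned int count_flush_candidates(const struct tracker_model *st,
					   unsigned int depth)
{
	unsigned int tag, count = 0;

	assert(st != NULL);
	for (tag = 0; tag < depth; tag++)
		if (tracker_is_outstanding(&st[tag]))
			count++;
	return count;
}
```

A tracker with smid == 0 or cb_idx == 0xFF is skipped; everything else is
treated as outstanding.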

Another concern is that setting ioc->shost_recovery prevents new calls of
scsih_qcmd() to submit any commands. But I don't think that setting that flag
prevents any scsih_qcmd() calls that had already been started to submit a new
command.

[Chaitra]
If a scsi cmnd has already crossed the check for the "ioc->shost_recovery"
flag (meaning that the scmd was issued just before the start of the host
reset operation), then it will be processed by the driver, which assigns it
a valid 'smid' in the range 1 to ioc->scsiio_depth (i.e. the scsi cmnd's tag
value + 1). Such commands are thus outstanding at the driver level and will
be flushed out with "DID_RESET" during the reset operation.
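
As a rough sketch of the tag-to-smid mapping described above (the helper
names smid_from_tag() and smid_is_valid() are hypothetical, and the
scsiio_depth parameter stands in for ioc->scsiio_depth):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* smid is the block-layer tag + 1, so a valid smid is never zero. */
static uint16_t smid_from_tag(uint16_t tag, uint16_t scsiio_depth)
{
	assert(tag < scsiio_depth);
	return (uint16_t)(tag + 1);
}

/* Valid smids lie in the range 1 .. scsiio_depth inclusive. */
static bool smid_is_valid(uint16_t smid, uint16_t scsiio_depth)
{
	return smid >= 1 && smid <= scsiio_depth;
}
```

Because tag 0 maps to smid 1, smid == 0 can be reserved to mean "not issued
to the HBA firmware".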

In other words, I don't think that checking whether or not st->smid == 0 is
sufficient to fix the reported race.

Bart.