On 3/13/2019 2:44 PM, Bart Van Assche wrote:
On Wed, 2019-03-13 at 13:19 -0700, James Smart wrote:
I've got an oops for the following stack:
...
lpfc_send_taskmgmt+0x28a [lpfc]
lpfc_bus_reset_handler+0x16a [lpfc]
scsi_try_bus_reset+0x3a
scsi_ioctl_reset+0x143
scsi_ioctl+0x18e [sg]
The issue is that lpfc eventually calls blk_mq_unique_tag() as
everything is scsi-mq enabled. blk_mq_unique_tag() oops as rq->mq_hctx
is NULL. It is NULL as scsi_ioctl_reset built up a dummy request struct
without a queue:
blk_rq_init(NULL, rq);
scmd = (struct scsi_cmnd *)(rq + 1);
scsi_init_command(dev, scmd);
scmd->request = rq;
scmd->cmnd = scsi_req(rq)->cmd;
What's the best way to approach fixing this ?
SCSI LLD bus reset handlers may use the members initialized by scsi_ioctl_reset()
but must not assume that all other struct scsi_cmnd members are valid. Several
years ago a similar crash was fixed in ib_srp.
Bart.
seems rather nuanced to me that scsi_cmnd->request may be non-null, yet
elements of the request may not be initialized.
Or said another way - the driver is to discern cmnds used for reset
requests from cmnds issued through the queuecommand - and *not* perform
blk_mq operations on the reset requests ?
which seems wrong as well - you want every driver to invent it's own way
to know actions are via a reset handler and to special case the cmnd
use. Why don't we fix it at the cause rather than making every driver
fix it ?
And did we document this in the mid/low api ?
-- james