On Mon, Sep 06, 2021 at 02:50:03PM +0800, Ming Lei wrote:
> blk-mq can't allocate a driver tag and update ->rqs[tag] atomically,
> and it doesn't clear ->rqs[tag] after the driver tag is released.
>
> So there is a chance of iterating over a stale request just after the
> tag is allocated and before ->rqs[tag] is updated.
>
> scsi_host_busy_iter() calls scsi_host_check_in_flight() to count SCSI
> in-flight requests after the SCSI host is blocked, so no new SCSI command
> can be marked as SCMD_STATE_INFLIGHT. However, driver tag allocation can
> still be run by the blk-mq core. A request that is marked as
> SCMD_STATE_INFLIGHT may still be referenced from another slot of ->rqs[];
> if that slot's tag is then allocated but ->rqs[] isn't updated yet, the
> in-flight request is counted twice as SCMD_STATE_INFLIGHT. This causes
> trouble in SCSI error handling.
>
> Fix the issue by not iterating over stale requests.
>
> Cc: linux-scsi@xxxxxxxxxxxxxxx
> Cc: "Martin K. Petersen" <martin.petersen@xxxxxxxxxx>
> Reported-by: luojiaxing <luojiaxing@xxxxxxxxxx>
> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>

Hello Jens,

luojiaxing has verified that this patch fixes his issue. Any chance of
merging it?

Thanks,
Ming
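
For reference, the double count described in the commit message can be
modelled in user space. This is only a sketch: struct fake_rq, struct
fake_tags, setup_race() and count_inflight() are hypothetical names, not
kernel code, and the stale check used here (comparing the request's own
tag against the slot index) is an assumption about how a stale entry could
be detected, not a copy of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_TAGS 4

/* Hypothetical stand-in for struct request; only what the sketch needs. */
struct fake_rq {
    int tag;        /* driver tag currently owned by this request */
    bool inflight;  /* models SCMD_STATE_INFLIGHT */
};

/* Hypothetical stand-in for the tag set: slot i should hold the owner of tag i. */
struct fake_tags {
    struct fake_rq *rqs[NR_TAGS];  /* models ->rqs[] */
    bool allocated[NR_TAGS];       /* models the tag bitmap */
};

/*
 * Build the racy state from the commit message:
 *  - rq used tag 1 earlier; rqs[1] was never cleared on release
 *  - rq now owns tag 2 and is in flight
 *  - tag 1 has just been allocated again, but rqs[1] is not updated yet
 */
void setup_race(struct fake_tags *t, struct fake_rq *rq)
{
    for (int i = 0; i < NR_TAGS; i++) {
        t->rqs[i] = NULL;
        t->allocated[i] = false;
    }
    rq->tag = 2;
    rq->inflight = true;
    t->rqs[1] = rq;            /* stale pointer left behind */
    t->rqs[2] = rq;            /* current, valid slot */
    t->allocated[2] = true;
    t->allocated[1] = true;    /* new owner has not stored its request yet */
}

/* Walk all allocated tags and count in-flight requests. */
int count_inflight(const struct fake_tags *t, bool skip_stale)
{
    int busy = 0;

    for (int i = 0; i < NR_TAGS; i++) {
        const struct fake_rq *rq = t->rqs[i];

        if (!t->allocated[i] || rq == NULL)
            continue;
        /* Assumed stale check: the slot does not match the request's tag. */
        if (skip_stale && rq->tag != i)
            continue;
        if (rq->inflight)
            busy++;
    }
    return busy;
}
```

After setup_race(), count_inflight() with skip_stale == false sees the same
request through slots 1 and 2; with skip_stale == true the entry in slot 1
is ignored and the request is counted once.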