linux-scsi-owner@xxxxxxxxxxxxxxx wrote on 2015/07/31 21:17:33:

> Hannes Reinecke <hare@xxxxxxx>
> Sent by: linux-scsi-owner@xxxxxxxxxxxxxxx
>
> 2015/07/31 21:17
>
> To
>
> jiang.biao2@xxxxxxxxxx, linux-scsi@xxxxxxxxxxxxxxx, JBottomley@xxxxxxxx,
>
> Cc
>
> Subject
>
> Re: [Patch] scsi_error: should not get sense for timeout IO in scsi
> error handler
>
> On 07/31/2015 11:52 AM, jiang.biao2@xxxxxxxxxx wrote:
> > scsi_error: should not get sense for timeout IO in scsi error handler
> >
> > When an IO timeout occurs, the IO will be aborted in
> > scsi_abort_command() and SCSI_EH_ABORT_SCHEDULED will be set. Because
> > of that, SCSI_EH_CANCEL_CMD will be cleared in scsi_eh_scmd_add().
> > So when the scsi error handler starts, it will get sense for this
> > timeout IO and the scmd of the IO request will be reused. In that
> > case, the scmd may be double released when racing with io_done(),
> > which will result in a crash.
> > So SCSI_EH_ABORT_SCHEDULED should also be checked when getting sense.
> > The bug may be reproduced when the link between host and disk is
> > unstable.
> >
> > Signed-off-by: Jiang Biao <jiang.biao2@xxxxxxxxxx>
> > Signed-off-by: Long Chun <long.chun@xxxxxxxxxx>
> > Reviewed-by: Tan Hu <tan.hu@xxxxxxxxxx>
> > Reviewed-by: Chen Donghai <chen.donghai@xxxxxxxxxx>
> > Reviewed-by: Cai Qu <cai.qu@xxxxxxxxxx>
> >
> > diff -uprN drivers/scsi/scsi_error.c drivers_new/scsi/scsi_error.c
> > --- scsi/scsi_error.c 2015-07-31 16:03:18.000000000 +0800
> > +++ scsi_new/scsi_error.c 2015-07-31 16:29:25.000000000 +0800
> > @@ -1156,9 +1156,14 @@ int scsi_eh_get_sense(struct list_head *
> >      struct Scsi_Host *shost;
> >      int rtn;
> >
> > +    /*
> > +     * If SCSI_EH_ABORT_SCHEDULED has been set, it is timeout IO,
> > +     * should not get sense.
> > +     */
> >      list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
> >          if ((scmd->eh_eflags & SCSI_EH_CANCEL_CMD) ||
> > -            SCSI_SENSE_VALID(scmd))
> > +            (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED) ||
> > +            SCSI_SENSE_VALID(scmd))
> >              continue;
> >
> >          shost = scmd->device->host;
> > --
> _Actually_ you need to test for both, SCSI_EH_CANCEL_CMD _and_
> SCSI_EH_ABORT_SCHEDULED.
> Not every driver is required to implement and/or support
> asynchronous command aborts, and those will be setting
> SCSI_EH_CANCEL_CMD even though they've run into a timeout.
>

That's right, but SCSI_EH_CANCEL_CMD is _already_ tested in the current
code, so there is no need to add it in the patch. With the patch applied,
both SCSI_EH_CANCEL_CMD _and_ SCSI_EH_ABORT_SCHEDULED are tested here,
which ensures that sense is never requested for *timeout IO*.

Thanks.
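
For anyone following along, the skip condition discussed above can be
modeled outside the kernel as a small standalone predicate. This is only
a sketch under assumptions: struct mock_scmd, mock_sense_valid(),
skip_get_sense() and count_needing_sense() are hypothetical names that
do not exist in the kernel, the flag values are stand-ins for
illustration, and the sense check assumes that a response code of
0x70..0x7f in byte 0 marks valid sense data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in values for illustration only; the real flags are defined in the kernel tree. */
#define SCSI_EH_CANCEL_CMD      0x0001
#define SCSI_EH_ABORT_SCHEDULED 0x0002

/* Hypothetical, minimal stand-in for the fields of struct scsi_cmnd used here. */
struct mock_scmd {
	int eh_eflags;
	unsigned char sense_buffer[1];
};

/* Mock of SCSI_SENSE_VALID(): assumes response codes 0x70..0x7f mean valid sense. */
static bool mock_sense_valid(const struct mock_scmd *scmd)
{
	return (scmd->sense_buffer[0] & 0x70) == 0x70;
}

/*
 * Mirrors the patched condition: returns true when the loop in
 * scsi_eh_get_sense() would 'continue' without requesting sense.
 */
static bool skip_get_sense(const struct mock_scmd *scmd)
{
	return (scmd->eh_eflags & SCSI_EH_CANCEL_CMD) ||
	       (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED) ||
	       mock_sense_valid(scmd);
}

/* Counts the commands in an array for which sense would still be requested. */
static size_t count_needing_sense(const struct mock_scmd *cmds, size_t n)
{
	size_t i, count = 0;

	assert(cmds != NULL);
	for (i = 0; i < n; i++)
		if (!skip_get_sense(&cmds[i]))
			count++;
	return count;
}
```

With this model, a command carrying either flag is skipped, whether its
driver supports asynchronous aborts (SCSI_EH_ABORT_SCHEDULED) or not
(SCSI_EH_CANCEL_CMD), so only commands with neither flag and no valid
sense are counted.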