[Patch] qla2xxx: shoud ensure no *io done* after qla2xxx_eh_abort returning SUCCESS to avoid race between *io timeout* and *io done*.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



qla2xxx: shoud ensure no *io done* after qla2xxx_eh_abort returning
SUCCESS to avoid race between *io timeout* and *io done*.

LLDD driver should ensure that there is no *io done* after eh_abort
(*qla2xxx_eh_abort*) returning SUCCESS to avoid the race between
*io timeout* and *io done*. But qla2xxx driver can not guarantee that.

If *io done* occurs when qla2xxx_eh_abort is in process, it may
return SUCCESS without actually aborting the cmd. In that case, the
race between *io timeout* and *io done* appears. Although there is
REQ_ATOM_COMPLETE flag trying to avoid the race, there is also race
possibility that may result in crash.

The following is one of the race cases,
         CPU 1                             CPU2
scsi_try_to_abort_cmd()->
   qla2xxx_eh_abort()
                                   *io done*->qla2x00_sp_compl()->
                                      qla2x00_sp_free_dma()
      return SUCCESS->
   *retry*->scsi_queue_insert()->
      blk_clear_rq_complete()
                                   scsi_done()->
                                      blk_complete_request()

In this race case, because the REQ_ATOM_COMPLETE flag has been
clear in blk_clear_rq_complete, the *io done* will try to complete
the IO request that has already been requeued. When the requeued IO
completes again, another *io done* of the same IO request will start
again. That means the same IO request will be finished twice, which
will result in double releasing the resources of the request, then
lead to crash.

So when *io done* happens during qla2xxx_eh_abort, we should wait
there till *io done* finishs, ensuring no *io done* after
qla2xxx_eh_abort returning SUCCESS, which avoids race between 
*io timeout* and *io done*.

Signed-off-by: Jiang Biao <jiang.biao2@xxxxxxxxxx>
Signed-off-by: Tan Hu <tan.hu@xxxxxxxxxx>
Reviewed-by: Cai Qu <cai.qu@xxxxxxxxxx>

diff -uprN scsi/qla_os.c scsi_new/qla_os.c
--- scsi/qla_os.c       2015-07-31 19:39:06.000000000 +0800
+++ scsi_new/qla_os.c   2015-07-31 19:46:20.000000000 +0800
@@ -939,8 +939,15 @@ qla2xxx_eh_abort(struct scsi_cmnd *cmd)
        int rval, wait = 0;
        struct qla_hw_data *ha = vha->hw;

-       if (!CMD_SP(cmd))
+       if (!CMD_SP(cmd)) {
+               /*
+                * Wait until io done finishs to avoid race between 
+                * io timeout and io done.
+                */
+               spin_lock_irqsave(&ha->hardware_lock, flags);
+               spin_unlock_irqrestore(&ha->hardware_lock, flags);
                return SUCCESS;
+       }

        ret = fc_block_scsi_eh(cmd);
        if (ret != 0)
@@ -996,8 +1003,15 @@ qla2xxx_eh_abort(struct scsi_cmnd *cmd)
        spin_unlock_irqrestore(&ha->hardware_lock, flags);

        /* Did the command return during mailbox execution? */
-       if (ret == FAILED && !CMD_SP(cmd))
+       if (ret == FAILED && !CMD_SP(cmd)) {
+               /*
+                * Wait until io done finishs to avoid race between 
+                * io timeout and io done.
+                */
+               spin_lock_irqsave(&ha->hardware_lock, flags);
+               spin_unlock_irqrestore(&ha->hardware_lock, flags);
                ret = SUCCESS;
+       }

        /* Wait for the command to be returned. */
        if (wait) {
��.n��������+%������w��{.n�����{������ܨ}���Ơz�j:+v�����w����ޙ��&�)ߡ�a����z�ޗ���ݢj��w�f




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux