Re: qla2xxx lock ordering question

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 27 Apr 2006, Arjan van de Ven wrote:

> On Thu, 2006-04-20 at 16:46 -0700, Andrew Vasquez wrote:
> > On Wed, 19 Apr 2006, Arjan van de Ven wrote:
> > 
> > > a question about qla2xxx lock ordering since it trips up with Ingo's
> > > lock depenceny tool:
> > > 
> > > in qla2x00_mailbox_command() the code first grabs the mbx_reg_lock lock,
> > > then the hardware_lock. So far so good. But then...
> > > it drops the mbx_reg_lock, does stuff, and regrabs the mbx_reg_lock
> > > lock, while keeping the hardware_lock held!
> > > 
> > > This appears to be an AB-BA deadlock risk since for the second part you
> > > are taking the locks in the wrong order... or am I missing something
> > > here?
> > 
> > Actually the code is a bit convoluted, but I believe we are OK.  There
> > are two scenarios we need to consider:
> 
> 
> ok found the real culprits ;)
> 
> in qla2x00_reset_chip the driver first takes the hardware lock, and then
> later on takes the mbx lock.
> 
> In the mailbox_command code.. it goes the other way around.

Ahh yes... that is suspicious...

> Now I don't know if reset_chip can in theory race with the mailbox code,
> but I'd not be surprised since reset_chip is called in error recovery
> iirc...

ISP chip recovery (resulting in a subsequent call to
qla2x00_reset_chip()) can only take place from the DPC thread context
(that at least drops most of mailbox command initiators, since we are
single threaded).  But, we are still left with mid-layer
error-handling initiators of mailbox commands (abort, device reset and
loop reset).

Hmm...  we do have a small window there where we could race.  In going
through the locking hierarchy though, I'm not entirely convinced
taking the mbx_reg_lock while resetting is needed as it's only meant
to protect an HA's mbx_cmd_flags state; which is only set during
interrupt processing (after we've released the hardware_lock).  The
reset-chip code has also disabled interrupts, further reducing the
window...

I'll do some testing with the following patch which drops the
acquisition of mbx_reg_lock.  Baring any major objections locally,
I'll post a formal patch to ls next week.

BTW: cool tool.

Thanks,
Andrew Vasquez

---

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 89a3fc0..7dc4067 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -519,20 +519,8 @@ qla2x00_reset_chip(scsi_qla_host_t *ha)
 
 	if (IS_QLA2100(ha) || IS_QLA2200(ha) || IS_QLA2300(ha)) {
 		for (cnt = 0; cnt < 30000; cnt++) {
-			if (!(test_bit(ABORT_ISP_ACTIVE, &ha->dpc_flags)))
-				spin_lock_irqsave(&ha->mbx_reg_lock, mbx_flags);
-
-			if (RD_MAILBOX_REG(ha, reg, 0) != MBS_BUSY) {
-				if (!(test_bit(ABORT_ISP_ACTIVE,
-				    &ha->dpc_flags)))
-					spin_unlock_irqrestore(
-					    &ha->mbx_reg_lock, mbx_flags);
+			if (RD_MAILBOX_REG(ha, reg, 0) != MBS_BUSY)
 				break;
-			}
-
-			if (!(test_bit(ABORT_ISP_ACTIVE, &ha->dpc_flags)))
-				spin_unlock_irqrestore(&ha->mbx_reg_lock,
-				    mbx_flags);
 
 			udelay(100);
 		}

> 

-
: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux