[PATCH] scsi_wq (fc transport) thread hang

I was testing changes to the LSI fc driver and managed to wedge the
fc transport's scsi_wq thread.  Here's what caused it.

reset board via lsiutil command

	(While I may not be 100% correct in the actual sequence
	of resetting the board, I'd say I'm close enough to
	accurately describe the problem.)

	reset function 0
		fc_remote_port_delete() of all ports on
		both functions (board just works that way)

	board sends rescan event after reset completes
	which causes
		fc_remote_port_add() / fc_remote_port_rolechg()
		of all targets on both functions

	reset function 1
		fc_remote_port_delete() of all ports on
		both functions

		A scan was in progress for the recently added
		rports.  The delete blocks the target(s).
		One of the scans was issuing scsi commands
		at the time of the block.

	board sends rescan event after reset completes
	for second function which causes
		fc_remote_port_add() / fc_remote_port_rolechg()
		of all targets on both functions, again.

		rolechg again queues the scan work for the
		target which was blocked while a scan was in
		progress.

		scsi_target_unblock() is part of the
		scan work and hence isn't called.

		nothing unblocks the target so the scan hangs.

After turning on debug output and adding a strategically placed
printk, the output below shows that rports are being deleted while
a scan is in progress/scheduled.  Once the resets complete, nothing
unblocks a target which has scan work in progress, so the scan (and
the thread) hangs.
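
To make the sequence concrete, here is a minimal user-space sketch of
the state involved.  It is only a model, not kernel code: struct
model_rport, the model_*() helpers and simulate() are hypothetical
names invented for illustration, and the "patched" parameter stands in
for the check of the pending flag that the attached patch adds.  The
model assumes, as described above, that the re-queued scan work cannot
start until the running scan finishes.

```c
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's struct fc_rport. */
struct model_rport {
	bool blocked;      /* target was blocked by a delete */
	bool scan_pending; /* stands in for FC_RPORT_SCAN_PENDING */
	bool scan_running; /* a scan is executing on the work thread */
	int queued_scans;  /* scan work waiting behind the running scan */
};

/* Models the delete path: the transport blocks the target. */
static void model_delete(struct model_rport *r)
{
	r->blocked = true;
}

/* Models the add/rolechg path, with or without the patched check. */
static void model_add(struct model_rport *r, bool patched)
{
	if (patched && r->scan_pending) {
		/* A scan is already pending: unblock, don't queue again. */
		r->blocked = false;
		return;
	}
	r->scan_pending = true;
	r->queued_scans++;
}

/* Models the work thread picking up scan work; the unblock is part of it. */
static void model_scan_start(struct model_rport *r)
{
	r->queued_scans--;
	r->blocked = false;
	r->scan_running = true;
}

/* The running scan can only complete if its target is not blocked. */
static bool model_scan_can_finish(const struct model_rport *r)
{
	return r->scan_running && !r->blocked;
}

/* Replays the reset sequence; returns true if the scan can complete. */
bool simulate(bool patched)
{
	struct model_rport r = { false, false, false, 0 };

	model_add(&r, patched);   /* rescan event after first reset */
	model_scan_start(&r);     /* scan begins issuing commands */
	model_delete(&r);         /* second reset blocks target mid-scan */
	model_add(&r, patched);   /* rescan event after second reset */

	return model_scan_can_finish(&r);
}
```

In this model simulate(false) returns false (the running scan stays
stuck behind the block while the re-queued work waits for it), and
simulate(true) returns true.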

I've attached a patch which corrects the problem in my test config.
This patch has been previously forwarded to James Smart, but as he's
been unable to respond (my patience is sometimes lacking) I decided to
post to linux-scsi to make others aware of the problem.  (I know,
I should have done this in parallel with the email to James....)


May  3 13:17:45 duck kernel: mptbase: ioc3: Sending Config request type 5, page 1 and action 0
May  3 13:17:45 duck kernel: mptbase: ioc3: config_complete (mf=e00000b005e1ac00,mr=e00000b005e00050)
May  3 13:17:45 duck kernel:   IOCStatus=0000h, IOCLogInfo=00000000h
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130bd9, 20000011c61e3afb / 21000011c61e3afb, tid 1, rport tid 1, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130ad9, 20000011c61e3af8 / 21000011c61e3af8, tid 3, rport tid 3, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130bd6, 20000011c61e3adb / 21000011c61e3adb, tid 5, rport tid 5, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130ace, 20000011c61e3ad9 / 21000011c61e3ad9, tid 7, rport tid 7, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130ad5, 20000011c61e3ad5 / 21000011c61e3ad5, tid 9, rport tid 9, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130bd5, 20000011c61e3a2d / 21000011c61e3a2d, tid 11, rport tid 11, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130ad4, 20000011c61e39f9 / 21000011c61e39f9, tid 13, rport tid 13, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130ad1, 20000011c61e39d1 / 21000011c61e39d1, tid 15, rport tid 15, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130bd3, 20000011c61e1a41 / 21000011c61e1a41, tid 17, rport tid 17, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130ad2, 20000011c61dec10 / 21000011c61dec10, tid 19, rport tid 19, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130bd1, 20000011c61de9fa / 21000011c61de9fa, tid 21, rport tid 21, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130bd4, 20000011c61de980 / 21000011c61de980, tid 23, rport tid 23, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130ad3, 20000011c61de970 / 21000011c61de970, tid 25, rport tid 25, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_qcmd.4: 1:0, mptscsih_qcmd returns non-zero, (1055).
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130bce, 20000011c61dd86c / 21000011c61dd86c, tid 27, rport tid 27, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130bd2, 20000011c61dd851 / 21000011c61dd851, tid 29, rport tid 29, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130ad6, 20000011c61dd831 / 21000011c61dd831, tid 31, rport tid 31, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130c00, 200700a0b81130aa / 204700a0b81130aa, tid 32, rport tid 32, tmo 60
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_reg_dev.4: 130d00, 200700a0b81130aa / 202700a0b81130aa, tid 34, rport tid 34, tmo 60
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130bd9 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61e3afb deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130ad9 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61e3af8 deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130bd6 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61e3adb deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130ace with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61e3ad9 deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130ad5 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61e3ad5 deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130bd5 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61e3a2d deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130ad4 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61e39f9 deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130ad1 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61e39d1 deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130bd3 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61e1a41 deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130ad2 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61dec10 deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130bd1 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61de9fa deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130bd4 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61de980 deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130ad3 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61de970 deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130bce with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61dd86c deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130bd2 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61dd851 deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130ad6 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 21000011c61dd831 deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130c00 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 204700a0b81130aa deleted
May  3 13:17:45 duck kernel: fc_remote_port_delete.4: blocking rport 130d00 with scan pending
May  3 13:17:45 duck kernel: mptfc: ioc2: mptfc_setup_reset.4: 202700a0b81130aa deleted
May  3 13:17:45 duck kernel: mptbase: ioc3: Sending Config request type 5, page 1 and action 1
May  3 13:17:45 duck kernel: mptbase: ioc3: config_complete (mf=e00000b005e1ae80,mr=e00000b005e000a0)
May  3 13:17:45 duck kernel:   IOCStatus=0000h, IOCLogInfo=00000000h
May  3 13:17:45 duck kernel: mptbase: ioc3: Sending Config request type 5, page 1 and action 0
May  3 13:17:45 duck kernel: mptbase: ioc3: config_complete (mf=e00000b005e1af80,mr=e00000b005e000f0)
May  3 13:17:45 duck kernel:   IOCStatus=0000h, IOCLogInfo=00000000h
May  3 13:17:45 duck kernel: mptbase: ioc3: Sending Config request type 5, page 1 and action 2
May  3 13:17:45 duck kernel: mptbase: ioc3: config_complete (mf=e00000b005e1b080,mr=e00000b005e00140)
May  3 13:17:45 duck kernel:   IOCStatus=0000h, IOCLogInfo=00000000h


Mike

		

It is possible for a transport initiated scan of a target to be in progress
with scsi commands outstanding when the lldd calls the transport to (again)
delete the device via fc_remote_port_delete().  The transport will then (again)
block the target.  When the lldd later calls fc_remote_port_add() for the
blocked target, the scan work is requeued but nothing unblocks the target
so that the active scan can complete.  (The unblock is integrated within the
scan work.)

This patch tests the FC_RPORT_SCAN_PENDING flag and will either unblock the
target if a scan is already pending or in progress, or will initiate a scan
which will unblock the target when the work executes.

Signed-off-by: Michael Reed <mdr@xxxxxxx>



--- rc3u/drivers/scsi/scsi_transport_fc.c	2006-04-27 12:32:06.000000000 -0500
+++ rc3/drivers/scsi/scsi_transport_fc.c	2006-05-04 10:35:51.276537641 -0500
@@ -1625,6 +1625,7 @@
 	struct fc_rport *rport;
 	unsigned long flags;
 	int match = 0;
+	int unblock = 0;
 
 	/* ensure any stgt delete functions are done */
 	fc_flush_work(shost);
@@ -1702,10 +1703,15 @@
 				rport->flags &= ~FC_RPORT_DEVLOSS_PENDING;
 
 				/* initiate a scan of the target */
-				rport->flags |= FC_RPORT_SCAN_PENDING;
-				scsi_queue_work(shost, &rport->scan_work);
-
+				if (rport->flags & FC_RPORT_SCAN_PENDING)
+					unblock = 1;
+				else {
+					rport->flags |= FC_RPORT_SCAN_PENDING;
+					scsi_queue_work(shost, &rport->scan_work);
+				}
 				spin_unlock_irqrestore(shost->host_lock, flags);
+				if (unblock)
+					scsi_target_unblock(&rport->dev);
 
 				return rport;
 			}
@@ -1760,11 +1766,17 @@
 
 			if (rport->roles & FC_RPORT_ROLE_FCP_TARGET) {
 				/* initiate a scan of the target */
-				rport->flags |= FC_RPORT_SCAN_PENDING;
-				scsi_queue_work(shost, &rport->scan_work);
+				if (rport->flags & FC_RPORT_SCAN_PENDING)
+					unblock = 1;
+				else {
+					rport->flags |= FC_RPORT_SCAN_PENDING;
+					scsi_queue_work(shost, &rport->scan_work);
+				}
 			}
 
 			spin_unlock_irqrestore(shost->host_lock, flags);
+			if (unblock)
+				scsi_target_unblock(&rport->dev);
 
 			return rport;
 		}
@@ -1859,7 +1871,6 @@
 	rport->port_state = FC_PORTSTATE_BLOCKED;
 
 	rport->flags |= FC_RPORT_DEVLOSS_PENDING;
-
 	spin_unlock_irqrestore(shost->host_lock, flags);
 
 	scsi_target_block(&rport->dev);
@@ -1896,6 +1907,7 @@
 	struct fc_host_attrs *fc_host = shost_to_fc_host(shost);
 	unsigned long flags;
 	int create = 0;
+	int unblock = 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (roles & FC_RPORT_ROLE_FCP_TARGET) {
@@ -1935,9 +1947,15 @@
 
 		/* initiate a scan of the target */
 		spin_lock_irqsave(shost->host_lock, flags);
-		rport->flags |= FC_RPORT_SCAN_PENDING;
-		scsi_queue_work(shost, &rport->scan_work);
+		if (rport->flags & FC_RPORT_SCAN_PENDING)
+			unblock = 1;
+		else {
+			rport->flags |= FC_RPORT_SCAN_PENDING;
+			scsi_queue_work(shost, &rport->scan_work);
+		}
 		spin_unlock_irqrestore(shost->host_lock, flags);
+		if (unblock)
+			scsi_target_unblock(&rport->dev);
 	}
 }
 EXPORT_SYMBOL(fc_remote_port_rolechg);
