On Fri, Nov 30, 2012 at 4:21 AM, David Dillow <dillowda@xxxxxxxx> wrote:
[...]
> Modulo a few style issues (braces around one line if branches, etc.) and
> having three state variables vs one, I can live with everything up to
> aabfa852acd27962 at git://github.com/bvanassche/linux.git#srp-ha. Those
> two are small things that can be fixed later and are not worth holding
> things up any further.
>
> I'll try to spend some time on the final four patches tomorrow afternoon.

Dave, Bart,

My colleague Alex Turin <alextu@xxxxxxxxxxxx> today tried the bits as they
appear in Roland's kernel.org tree / for-next branch, up to commit
fb57e1dbbd4, and here's some feedback.

Basically, what he did was connect to a target, then take down the IB port
on the initiator side, and issue some I/O
(dd if=/dev/sdb of=/dev/null count=1).

Our reconstruction of the events from the logs (below) is the following:

1. The queued command gets completion status 5.
2. As part of error handling, srp_reset_host() is called.
3. srp_reset_host() calls srp_reconnect_target(), which fails because the
   port is down.
4. On failure, srp_reconnect_target() calls srp_queue_remove_work(), which
   sets target->state to SRP_TARGET_REMOVED.
5. srp_reset_host() is called a second time. It calls
   srp_reconnect_target(), but now target->state == SRP_TARGET_REMOVED.
   srp_reconnect_target() checks whether target->state != SRP_TARGET_LIVE
   and returns -EAGAIN.

This probably means that even after the port is re-enabled, the reconnect
will still fail?

Or.
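
To make steps 3-5 above concrete, here is a minimal user-space sketch in
plain C of the state handling as we read it from the logs. This is not the
ib_srp code: every sketch_* name, the two-value state enum, and the port_up
flag are hypothetical stand-ins, and the -ETIMEDOUT return only mirrors the
-110 seen in the log on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; these are NOT the real ib_srp definitions. */
enum sketch_target_state {
    SKETCH_TARGET_LIVE,
    SKETCH_TARGET_REMOVED,
};

struct sketch_target {
    enum sketch_target_state state;
    bool port_up;   /* hypothetical: models the initiator IB port state */
};

/* Models step 4: queueing the removal work marks the target as removed. */
static void sketch_queue_remove_work(struct sketch_target *t)
{
    t->state = SKETCH_TARGET_REMOVED;
}

/* Models steps 3 and 5: reconnect is refused unless the target is LIVE. */
static int sketch_reconnect_target(struct sketch_target *t)
{
    assert(t != NULL);

    /* Step 5: a removed target never gets as far as trying the port. */
    if (t->state != SKETCH_TARGET_LIVE)
        return -EAGAIN;

    /* Step 3: with the port down the path record query times out. */
    if (!t->port_up) {
        sketch_queue_remove_work(t);
        return -ETIMEDOUT;  /* 110 on Linux, as in "reconnect failed (-110)" */
    }

    return 0;
}
```

In this sketch, once the first call has failed with the port down, a second
call returns -EAGAIN even if port_up has been set back to true, which is the
behaviour we are asking about.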

Dec 5 16:19:13 rsws42 kernel: scsi host7: ib_srp: failed send status 5
Dec 5 16:19:42 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:19:42 rsws42 kernel: scsi host7: SRP reset_device called
Dec 5 16:19:42 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec 5 16:19:43 rsws42 kernel: scsi host7: ib_srp: Got failed path rec status -110
Dec 5 16:19:43 rsws42 kernel: scsi host7: ib_srp: Path record query failed
Dec 5 16:19:43 rsws42 kernel: scsi host7: ib_srp: reconnect failed (-110), removing target port.
Dec 5 16:19:43 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery
Dec 5 16:19:43 rsws42 kernel: sd 7:0:0:11: [sdb] Synchronizing SCSI cache
Dec 5 16:20:45 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:20:50 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:21:05 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:21:10 rsws42 kernel: scsi host7: SRP reset_device called
Dec 5 16:21:15 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec 5 16:21:15 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery
Dec 5 16:21:15 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery

repeating part:

Dec 5 16:22:17 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:22:22 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:22:37 rsws42 kernel: scsi host7: SRP abort called
Dec 5 16:22:42 rsws42 kernel: scsi host7: SRP reset_device called
Dec 5 16:22:47 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec 5 16:22:47 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery
Dec 5 16:22:47 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery