Re: [PATCH 00/11] First pass at merging Bart's HA work

On Fri, Nov 30, 2012 at 4:21 AM, David Dillow <dillowda@xxxxxxxx> wrote:
[...]
> Modulo a few style issues (braces around one line if branches, etc.) and
> having three state variables vs one, I can live with everything up to
> aabfa852acd27962 at git://github.com/bvanassche/linux.git#srp-ha. Those
> two are small things that can be fixed later and are not worth holding
> things up any further.
>
> I'll try to spend some time on the final four patches tomorrow afternoon.

Dave, Bart

My colleague Alex Turin <alextu@xxxxxxxxxxxx> today tried the bits as
they appear in Roland's kernel.org tree, for-next branch, up to commit
fb57e1dbbd4, and here's some feedback.

He connected to a target, then took down the IB port on the
initiator side and issued some I/O (dd if=/dev/sdb of=/dev/null
count=1).

Our reconstruction of events from the logs (below) is the following:

1. The queued command gets completion status 5.

2. As part of error handling, srp_reset_host() is called.

3. srp_reset_host() calls srp_reconnect_target(), which fails because
the port is down.

4. On failure, srp_reconnect_target() calls srp_queue_remove_work(),
which sets target->state to SRP_TARGET_REMOVED.

5. srp_reset_host() is called a second time. It calls
srp_reconnect_target() again, but now target->state ==
SRP_TARGET_REMOVED. srp_reconnect_target() checks whether
target->state != SRP_TARGET_LIVE and returns -EAGAIN.

This probably means that even after the port is re-enabled, the
reconnect will still fail?
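
To make the sequence above concrete, here is a minimal user-space sketch
of the state transitions as we understand them. It is not the ib_srp
code: the model_* names, struct model_target and the port_up flag are
hypothetical stand-ins, and only the state names, the -EAGAIN return and
the -110 error are taken from the description and logs in this mail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-ins for the driver's target states. */
enum srp_target_state {
	SRP_TARGET_LIVE,
	SRP_TARGET_REMOVED,
};

/* Hypothetical model of a target; not the driver's structure. */
struct model_target {
	enum srp_target_state state;
	bool port_up;	/* assumption: stands in for "the IB port is usable" */
};

/* Models step 4: a failed reconnect marks the target as removed. */
static void model_queue_remove_work(struct model_target *t)
{
	t->state = SRP_TARGET_REMOVED;
}

/* Models steps 3 and 5 of the sequence described above. */
static int model_reconnect_target(struct model_target *t)
{
	if (t->state != SRP_TARGET_LIVE)
		return -EAGAIN;		/* step 5: no reconnect is attempted */

	if (!t->port_up) {
		model_queue_remove_work(t);	/* step 4 */
		return -ETIMEDOUT;	/* -110 on Linux, as seen in the log */
	}
	return 0;
}

/* Replays the reported sequence, then brings the port back up. */
static void replay_reported_sequence(void)
{
	struct model_target t = { .state = SRP_TARGET_LIVE, .port_up = false };

	assert(model_reconnect_target(&t) == -ETIMEDOUT);	/* first reset_host */
	assert(t.state == SRP_TARGET_REMOVED);
	assert(model_reconnect_target(&t) == -EAGAIN);		/* second reset_host */

	t.port_up = true;					/* port re-enabled */
	assert(model_reconnect_target(&t) == -EAGAIN);		/* still refused */
}
```

In this model nothing ever moves the state back to SRP_TARGET_LIVE, so
the last call is refused even with the port up. Whether the real driver
behaves the same way is exactly the question above.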

Or.


Dec  5 16:19:13 rsws42 kernel: scsi host7: ib_srp: failed send status 5
Dec  5 16:19:42 rsws42 kernel: scsi host7: SRP abort called
Dec  5 16:19:42 rsws42 kernel: scsi host7: SRP reset_device called
Dec  5 16:19:42 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec  5 16:19:43 rsws42 kernel: scsi host7: ib_srp: Got failed path rec status -110
Dec  5 16:19:43 rsws42 kernel: scsi host7: ib_srp: Path record query failed
Dec  5 16:19:43 rsws42 kernel: scsi host7: ib_srp: reconnect failed (-110), removing target port.
Dec  5 16:19:43 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery
Dec  5 16:19:43 rsws42 kernel: sd 7:0:0:11: [sdb] Synchronizing SCSI cache
Dec  5 16:20:45 rsws42 kernel: scsi host7: SRP abort called
Dec  5 16:20:50 rsws42 kernel: scsi host7: SRP abort called
Dec  5 16:21:05 rsws42 kernel: scsi host7: SRP abort called
Dec  5 16:21:10 rsws42 kernel: scsi host7: SRP reset_device called
Dec  5 16:21:15 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec  5 16:21:15 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery
Dec  5 16:21:15 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery

repeating part:

Dec  5 16:22:17 rsws42 kernel: scsi host7: SRP abort called
Dec  5 16:22:22 rsws42 kernel: scsi host7: SRP abort called
Dec  5 16:22:37 rsws42 kernel: scsi host7: SRP abort called
Dec  5 16:22:42 rsws42 kernel: scsi host7: SRP reset_device called
Dec  5 16:22:47 rsws42 kernel: scsi host7: ib_srp: SRP reset_host called
Dec  5 16:22:47 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery
Dec  5 16:22:47 rsws42 kernel: sd 7:0:0:11: Device offlined - not ready after error recovery

