Re: blk_abort_queue on failed paths?

adding linux-scsi and Mike Anderson

David Strand wrote:
After updating to kernel 2.6.28 I found that when I performed some
cable break testing during device i/o, I would get unwanted device or
host resets. Ultimately I traced it back to this patch:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.29.y.git;a=commit;h=224cb3e981f1b2f9f93dbd49eaef505d17d894c2

The call to blk_abort_queue causes the block layer to call
scsi_times_out for pending i/o, which can (or will) ultimately lead to
device, and/or bus and/or host resets, which of course cause all the
other devices significant disruption.


What driver were you using? I just did a workaround for qla4xxx for this (have not posted it yet). I added a scsi_times_out handler to the driver so that if the IO was failed due to a transport problem, the eh does not run.

FC drivers already use fc_timed_out, but I do not think that is enough. The FC driver could fail the IO and only then call fc_remote_port_delete. The failed IO could hit dm-mpath.c, which could call into scsi_times_out (which for FC drivers calls into fc_timed_out) before fc_remote_port_delete has been done. At that point the port_state is still online, so that kicks off the scsi eh.
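
To make the ordering problem concrete, here is a toy user-space model in Python. It is not kernel code. The two states and two return values mirror FC_PORTSTATE_ONLINE/FC_PORTSTATE_BLOCKED and BLK_EH_NOT_HANDLED/BLK_EH_RESET_TIMER as I understand them; the Rport class and the *_model functions are made-up names for illustration only.

```python
from enum import Enum


class PortState(Enum):
    ONLINE = "online"
    BLOCKED = "blocked"


class EhReturn(Enum):
    NOT_HANDLED = "not handled"  # timeout falls through to the scsi eh
    RESET_TIMER = "reset timer"  # timer is re-armed, the eh does not run


class Rport:
    """Hypothetical stand-in for an FC remote port."""

    def __init__(self):
        self.port_state = PortState.ONLINE


def fc_timed_out_model(rport):
    # Only a blocked rport keeps the eh from running.
    if rport.port_state is PortState.BLOCKED:
        return EhReturn.RESET_TIMER
    return EhReturn.NOT_HANDLED


def rport_delete_model(rport):
    # Simplified: deleting the remote port moves it to the blocked state.
    rport.port_state = PortState.BLOCKED


def abort_queue_model(rport, pending):
    # Simplified blk_abort_queue: force the timeout handler for each pending IO.
    return [fc_timed_out_model(rport) for _ in range(pending)]


rport = Rport()
# The driver fails one IO, dm-mpath aborts the queue, the delete comes later.
early = abort_queue_model(rport, 2)
rport_delete_model(rport)
# Same abort after the delete: the timers are just reset.
late = abort_queue_model(rport, 2)
```

In this model every entry in `early` is NOT_HANDLED (the eh would run), while every entry in `late` is RESET_TIMER. The only difference between the two is whether the abort happened before or after the port was blocked.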

For transport errors I do not think blk_abort_queue is needed anymore - at least for scsi drivers. For FC almost every driver supports the terminate_rport_io callback (only mptfc does not), so you can set the fast io fail tmo to make sure all IO is failed quickly. For iscsi, we have the replacement/recovery_timeout. And for SAS, I think there is a timeout or the device/target/port is deleted, right?
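
For the FC case, a rough sketch of setting the fast io fail tmo from user space is below. It assumes the FC transport class exposes the value as a fast_io_fail_tmo file under /sys/class/fc_remote_ports/rport-*/, which you should check on your kernel. The function name and its arguments are my own and only for illustration, and writing to the real sysfs files needs root.

```python
import glob
import os

# Assumed sysfs location of the FC remote port attributes.
FC_RPORT_DIR = "/sys/class/fc_remote_ports"


def set_fast_io_fail_tmo(seconds, rport_dir=FC_RPORT_DIR):
    """Write `seconds` to the fast_io_fail_tmo file of every rport found.

    Returns the list of attribute files that were updated.
    """
    updated = []
    pattern = os.path.join(rport_dir, "rport-*", "fast_io_fail_tmo")
    for attr in sorted(glob.glob(pattern)):
        with open(attr, "w") as f:
            f.write(str(seconds))
        updated.append(attr)
    return updated
```

Calling set_fast_io_fail_tmo(5) would then ask for outstanding IO to be failed 5 seconds after a port is blocked, on every rport the glob finds. Passing a different rport_dir lets you try it against a scratch directory first.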


What was the reason for this change? I searched through my email from
this mailing list and could not find a discussion about it.


It seems like it would only make sense to call blk_abort_queue for maybe some block drivers (do cciss or dasd need it?) or maybe for device errors. But it seems to be broken for the common multipath use cases.

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
