On 04/30/15 11:44, Sagi Grimberg wrote:
On 4/30/2015 11:57 AM, Bart Van Assche wrote:
Avoid that srp_terminate_io() can get invoked while srp_queuecommand()
is in progress. This patch avoids that an I/O timeout can trigger the
following kernel warning:
WARNING: at drivers/infiniband/ulp/srp/ib_srp.c:1447
srp_terminate_io+0xef/0x100 [ib_srp]()
Call Trace:
[<ffffffff814c65a2>] dump_stack+0x4e/0x68
[<ffffffff81051f71>] warn_slowpath_common+0x81/0xa0
[<ffffffff8105204a>] warn_slowpath_null+0x1a/0x20
[<ffffffffa075f51f>] srp_terminate_io+0xef/0x100 [ib_srp]
[<ffffffffa07495da>] __rport_fail_io_fast+0xba/0xc0
[scsi_transport_srp]
[<ffffffffa0749a90>] rport_fast_io_fail_timedout+0xe0/0xf0
[scsi_transport_srp]
[<ffffffff8106e09b>] process_one_work+0x1db/0x780
[<ffffffff8106e75b>] worker_thread+0x11b/0x450
[<ffffffff81073c64>] kthread+0xe4/0x100
[<ffffffff814cf26c>] ret_from_fork+0x7c/0xb0
See also patch "scsi_transport_srp: Add transport layer error
handling" (commit ID 29c17324803c).
Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
Cc: James Bottomley <JBottomley@xxxxxxxx>
Cc: Sagi Grimberg <sagig@xxxxxxxxxxxx>
Cc: Sebastian Parschauer <sebastian.riemer@xxxxxxxxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx> #v3.13
---
drivers/scsi/scsi_transport_srp.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_transport_srp.c
b/drivers/scsi/scsi_transport_srp.c
index 6ce1c48..4a44337 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -437,8 +437,10 @@ static void __rport_fail_io_fast(struct srp_rport
*rport)
/* Involve the LLD if possible to terminate all I/O on the
rport. */
i = to_srp_internal(shost->transportt);
- if (i->f->terminate_rport_io)
+ if (i->f->terminate_rport_io) {
+ srp_wait_for_queuecommand(shost);
i->f->terminate_rport_io(rport);
+ }
Why not just terminate the inflight IO before unblocking the target?
Sorry but I don't think that would prevent the described race condition.
The call trace in the description of this patch illustrates that
srp_queuecommand() can still be active even after the transport state
has been changed into "offline". Hence if terminate_rport_io() would be
invoked earlier the same race would still exist.
Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html