> On Jun 5, 2020, at 9:44 AM, Roman Bolshakov <r.bolshakov@xxxxxxxxx> wrote:
>
> The driver performs SCR (state change registration) in all modes
> including pure target mode.
>
> For each RSCN, scan_needed flag is set in qla2x00_handle_rscn() for the
> port mentioned in the RSCN and fabric rescan is scheduled. During the
> rescan, GNN_FT handler, qla24xx_async_gnnft_done() deletes session of
> the port that caused the RSCN.
>
> In target mode, the session deletion has an impact on ATIO handler,
> qlt_24xx_atio_pkt(). Target responds with SAM STATUS BUSY to I/O
> incoming from the deleted session. qlt_handle_cmd_for_atio() and
> qlt_handle_task_mgmt() return -EFAULT if they are not able to find
> session of the command/TMF, and that results in invocation of
> qlt_send_busy():
>
>   qlt_24xx_atio_pkt_all_vps: qla_target(0): type 6 ox_id 0014
>   qla_target(0): Unable to send command to target, sending BUSY status
>
> Such response causes command timeout on the initiator. Error handler
> thread on the initiator will be spawned to abort the commands:
>
>   scsi 23:0:0:0: tag#0 abort scheduled
>   scsi 23:0:0:0: tag#0 aborting command
>   qla2xxx [0000:af:00.0]-188c:23: Entered qla24xx_abort_command.
>   qla2xxx [0000:af:00.0]-801c:23: Abort command issued nexus=23:0:0 -- 0 2003.
>
> Command abort is rejected by target and fails (2003), error handler then
> tries to perform DEVICE RESET and TARGET RESET but they're also doomed
> to fail because TMFs are ignored for the deleted sessions.
>
> Then initiator makes BUS RESET that resets the link via
> qla2x00_full_login_lip(). BUS RESET succeeds and brings initiator port
> up, SAN switch detects that and sends RSCN to the target port and it
> fails again the same way as described above. It never goes out of the
> loop.
>
> The change breaks the RSCN loop by keeping initiator sessions mentioned
> in RSCN payload in all modes, including dual and pure target mode.
>
> Fixes: 2037ce49d30a ("scsi: qla2xxx: Fix stale session")
> Cc: Quinn Tran <qutran@xxxxxxxxxxx>
> Cc: Arun Easi <aeasi@xxxxxxxxxxx>
> Cc: Nilesh Javali <njavali@xxxxxxxxxxx>
> Cc: Bart Van Assche <bvanassche@xxxxxxx>
> Cc: Daniel Wagner <dwagner@xxxxxxx>
> Cc: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> Cc: Martin Wilck <mwilck@xxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # v5.4+
> Signed-off-by: Roman Bolshakov <r.bolshakov@xxxxxxxxx>
> ---
>  drivers/scsi/qla2xxx/qla_gs.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> Changes since v1:
> - Corrected an error when N_Port_ID change wouldn't clean up stale
>   session (Martin W.).
>
>   N_Port_ID may change in the switched fabric topology if initiator
>   cable is replugged to another physical port on the SAN switch (some
>   fabrics assign physical port number to domain area). Physical
>   reconnection implies that initiator is going to relogin anyway and
>   previous session is no longer needed.
>
> diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
> index 42c3ad27f1cb..df670fba2ab8 100644
> --- a/drivers/scsi/qla2xxx/qla_gs.c
> +++ b/drivers/scsi/qla2xxx/qla_gs.c
> @@ -3496,7 +3496,9 @@ void qla24xx_async_gnnft_done(scsi_qla_host_t *vha, srb_t *sp)
>  				qla2x00_clear_loop_id(fcport);
>  				fcport->flags |= FCF_FABRIC_DEVICE;
>  			} else if (fcport->d_id.b24 != rp->id.b24 ||
> -				fcport->scan_needed) {
> +				   (fcport->scan_needed &&
> +				    fcport->port_type != FCT_INITIATOR &&
> +				    fcport->port_type != FCT_NVME_INITIATOR)) {
>  				qlt_schedule_sess_for_deletion(fcport);
>  			}
>  			fcport->d_id.b24 = rp->id.b24;
> --
> 2.26.1
>

Looks fine.

Reviewed-by: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>

--
Himanshu Madhani Oracle Linux Engineering
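
For anyone reading the hunk in isolation, the behavioural change can be sketched as two standalone predicates. This is only an illustration under stated assumptions, not driver code: the struct, the enum and its values, and both function names are hypothetical stand-ins for fc_port_t, the FCT_* port types, and the condition inside qla24xx_async_gnnft_done(). Only the boolean logic is taken from the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's FCT_* port types.
 * The numeric values here are illustrative, not the driver's. */
enum sk_port_type {
	SK_FCT_UNKNOWN,
	SK_FCT_INITIATOR,
	SK_FCT_TARGET,
	SK_FCT_NVME_INITIATOR,
};

/* Hypothetical, reduced view of the fields the condition reads. */
struct sk_fcport {
	uint32_t d_id_b24;        /* 24-bit N_Port_ID known to the driver */
	bool scan_needed;         /* set when an RSCN named this port */
	enum sk_port_type port_type;
};

/* Condition before the patch: any RSCN-flagged port loses its session. */
bool sk_delete_session_before(const struct sk_fcport *p, uint32_t rp_id_b24)
{
	return p->d_id_b24 != rp_id_b24 || p->scan_needed;
}

/* Condition after the patch: an N_Port_ID change still deletes the
 * session, but an RSCN alone no longer deletes initiator sessions. */
bool sk_delete_session_after(const struct sk_fcport *p, uint32_t rp_id_b24)
{
	return p->d_id_b24 != rp_id_b24 ||
	       (p->scan_needed &&
		p->port_type != SK_FCT_INITIATOR &&
		p->port_type != SK_FCT_NVME_INITIATOR);
}
```

With these stand-ins, an initiator port that is flagged by an RSCN but keeps its N_Port_ID is deleted by the old predicate and kept by the new one, while a changed N_Port_ID triggers deletion in both, which matches the v2 changelog note about stale sessions.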