During NPIV testing with a large number of vports, it was sometimes seen
that not all vports would log back in to the target device.

The fabric can be slow to respond to a burst of GID_PT requests, and the
SLI port may then abort a GID_PT request that has been outstanding too
long. In that case lpfc_cmpl_ct_cmd_gid_pt takes the
lpfc_error_lost_link path and, if the vport is in RSCN mode, calls
lpfc_els_flush_rscn. That part is fine, but the path does not decrement
the gidft_inp counter. As a result vport->gidft_inp never reaches 0 and
discovery is never restarted.

Decrement vport->gidft_inp when lpfc_error_lost_link is true in both
lpfc_cmpl_ct_cmd_gid_pt and lpfc_cmpl_ct_cmd_gid_ft.

Also log more state on RSCN timeout and lpfc_error_lost_link events.

Co-developed-by: Justin Tee <justin.tee@xxxxxxxxxxxx>
Signed-off-by: Justin Tee <justin.tee@xxxxxxxxxxxx>
Signed-off-by: James Smart <jsmart2021@xxxxxxxxx>
---
 drivers/scsi/lpfc/lpfc_ct.c      | 16 ++++++++++++++--
 drivers/scsi/lpfc/lpfc_hbadisc.c |  5 +++--
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c
index 6cda8ee25d4f..094199d1006a 100644
--- a/drivers/scsi/lpfc/lpfc_ct.c
+++ b/drivers/scsi/lpfc/lpfc_ct.c
@@ -960,9 +960,15 @@ lpfc_cmpl_ct_cmd_gid_ft(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 	}
 	if (lpfc_error_lost_link(ulp_status, ulp_word4)) {
 		lpfc_printf_vlog(vport, KERN_INFO, LOG_DISCOVERY,
-				 "0226 NS query failed due to link event\n");
+				 "0226 NS query failed due to link event: "
+				 "ulp_status x%x ulp_word4 x%x fc_flag x%x "
+				 "port_state x%x gidft_inp x%x\n",
+				 ulp_status, ulp_word4, vport->fc_flag,
+				 vport->port_state, vport->gidft_inp);
 		if (vport->fc_flag & FC_RSCN_MODE)
 			lpfc_els_flush_rscn(vport);
+		if (vport->gidft_inp)
+			vport->gidft_inp--;
 		goto out;
 	}
 
@@ -1177,9 +1183,15 @@ lpfc_cmpl_ct_cmd_gid_pt(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 	}
 	if (lpfc_error_lost_link(ulp_status, ulp_word4)) {
 		lpfc_printf_vlog(vport, KERN_INFO, LOG_DISCOVERY,
-				 "4166 NS query failed due to link event\n");
+				 "4166 NS query failed due to link event: "
+				 "ulp_status x%x ulp_word4 x%x fc_flag x%x "
+				 "port_state x%x gidft_inp x%x\n",
+				 ulp_status, ulp_word4, vport->fc_flag,
+				 vport->port_state, vport->gidft_inp);
 		if (vport->fc_flag & FC_RSCN_MODE)
 			lpfc_els_flush_rscn(vport);
+		if (vport->gidft_inp)
+			vport->gidft_inp--;
 		goto out;
 	}
 
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
index a833a493a3ee..3ab22ac49e49 100644
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -6355,8 +6355,9 @@ lpfc_disc_timeout_handler(struct lpfc_vport *vport)
 			lpfc_printf_vlog(vport, KERN_ERR,
 					 LOG_TRACE_EVENT,
 					 "0231 RSCN timeout Data: x%x "
-					 "x%x\n",
-					 vport->fc_ns_retry, LPFC_MAX_NS_RETRY);
+					 "x%x x%x x%x\n",
+					 vport->fc_ns_retry, LPFC_MAX_NS_RETRY,
+					 vport->port_state, vport->gidft_inp);
 
 			/* Cleanup any outstanding ELS commands */
 			lpfc_els_flush_cmd(vport);
-- 
2.26.2
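
For reviewers who want to see the counter leak in isolation, here is a
minimal userspace sketch of the accounting described above. It is not
driver code: struct sketch_vport and every sketch_* function are
hypothetical stand-ins, and the "fixed" flag only models whether the
decrement added by this patch is present on the lost-link path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few vport fields this patch touches. */
struct sketch_vport {
	unsigned int gidft_inp;	/* name server queries still in flight */
	bool rscn_mode;		/* models FC_RSCN_MODE being set */
	bool rscn_flushed;	/* models lpfc_els_flush_rscn() having run */
};

/* Issuing a query accounts for it as in flight. */
static void sketch_issue_query(struct sketch_vport *vp)
{
	assert(vp != NULL);
	vp->gidft_inp++;
}

/*
 * Completion of one query. lost_link models lpfc_error_lost_link()
 * returning true; fixed selects the behavior with this patch applied.
 */
static void sketch_cmpl_query(struct sketch_vport *vp, bool lost_link,
			      bool fixed)
{
	assert(vp != NULL);
	if (lost_link) {
		if (vp->rscn_mode)
			vp->rscn_flushed = true;
		/* The decrement this patch adds to the early-exit path. */
		if (fixed && vp->gidft_inp)
			vp->gidft_inp--;
		return;
	}
	/* Normal completion always drops the in-flight count. */
	if (vp->gidft_inp)
		vp->gidft_inp--;
}

/*
 * Issue n queries, complete the first `aborted` of them through the
 * lost-link path and the rest normally, and return what is left in
 * gidft_inp. A nonzero result models discovery never restarting.
 */
static unsigned int sketch_run(unsigned int n, unsigned int aborted,
			       bool fixed)
{
	struct sketch_vport vp = { .rscn_mode = true };
	unsigned int i;

	for (i = 0; i < n; i++)
		sketch_issue_query(&vp);
	for (i = 0; i < n; i++)
		sketch_cmpl_query(&vp, i < aborted, fixed);
	return vp.gidft_inp;
}
```

With four queries and one aborted, sketch_run(4, 1, false) leaves the
counter at 1, while sketch_run(4, 1, true) drains it to 0.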