Allow to set the default dev_loss_tmo value as kernel module option. Cc: Nilesh Javali <njavali@xxxxxxxxxxx> Cc: Arun Easi <aeasi@xxxxxxxxxxx> Signed-off-by: Daniel Wagner <dwagner@xxxxxxx> --- Hi, During array upgrade tests with NVMe/FC on systems equiped with QLogic HBAs we faced the problem with the default setting of dev_loss_tmo. When the default timeout hit after 60 seconds the file system went into read only mode. The fix was to set the dev_loss_tmo to infinity (note this patch can't handle this). For lpfc devices we could use the sysfs interface under fc_remote_ports which exposed the dev_loss_tmo for SCSI and NVMe rports. The QLogic only expose the rports via fc_remote_ports if SCSI is used. There is the debugfs interface to set the dev_loss_tmo but this has two issues. First, it's not watched by udevd hence no rules work. This could be somehow worked around by setting it statically, but that is really only an option for testing. Even if the debugfs interface is used there is a bug in the code. In qla_nvme_register_remote() the value 0 is assigned to dev_loss_tmo and the NVMe core will use it's default value 60 (this code path is exercised if the rport droppes twice). Anyway, this patch is just to get the discussion going. Maybe the driver could implement the fc_remote_port interface? Hannes was pointing out it might make sense to think about an controller sysfs API as there is already a host and the NVMe protocol is all about host and controller. Thanks, Daniel drivers/scsi/qla2xxx/qla_attr.c | 4 ++-- drivers/scsi/qla2xxx/qla_gbl.h | 1 + drivers/scsi/qla2xxx/qla_nvme.c | 2 +- drivers/scsi/qla2xxx/qla_os.c | 5 +++++ 4 files changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c index 3aa9869f6fae..0d2386ba65c0 100644 --- a/drivers/scsi/qla2xxx/qla_attr.c +++ b/drivers/scsi/qla2xxx/qla_attr.c @@ -3036,7 +3036,7 @@ qla24xx_vport_create(struct fc_vport *fc_vport, bool disable) } /* initialize attributes */ - fc_host_dev_loss_tmo(vha->host) = ha->port_down_retry_count; + fc_host_dev_loss_tmo(vha->host) = ql2xdev_loss_tmo; fc_host_node_name(vha->host) = wwn_to_u64(vha->node_name); fc_host_port_name(vha->host) = wwn_to_u64(vha->port_name); fc_host_supported_classes(vha->host) = @@ -3260,7 +3260,7 @@ qla2x00_init_host_attr(scsi_qla_host_t *vha) struct qla_hw_data *ha = vha->hw; u32 speeds = FC_PORTSPEED_UNKNOWN; - fc_host_dev_loss_tmo(vha->host) = ha->port_down_retry_count; + fc_host_dev_loss_tmo(vha->host) = ql2xdev_loss_tmo; fc_host_node_name(vha->host) = wwn_to_u64(vha->node_name); fc_host_port_name(vha->host) = wwn_to_u64(vha->port_name); fc_host_supported_classes(vha->host) = ha->base_qpair->enable_class_2 ? diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h index fae5cae6f0a8..0b9c24475711 100644 --- a/drivers/scsi/qla2xxx/qla_gbl.h +++ b/drivers/scsi/qla2xxx/qla_gbl.h @@ -178,6 +178,7 @@ extern int ql2xdifbundlinginternalbuffers; extern int ql2xfulldump_on_mpifail; extern int ql2xenforce_iocb_limit; extern int ql2xabts_wait_nvme; +extern int ql2xdev_loss_tmo; extern int qla2x00_loop_reset(scsi_qla_host_t *); extern void qla2x00_abort_all_cmds(scsi_qla_host_t *, int); diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c index 0cacb667a88b..cdc5b5075407 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.c +++ b/drivers/scsi/qla2xxx/qla_nvme.c @@ -41,7 +41,7 @@ int qla_nvme_register_remote(struct scsi_qla_host *vha, struct fc_port *fcport) req.port_name = wwn_to_u64(fcport->port_name); req.node_name = wwn_to_u64(fcport->node_name); req.port_role = 0; - req.dev_loss_tmo = 0; + req.dev_loss_tmo = ql2xdev_loss_tmo; if (fcport->nvme_prli_service_param & NVME_PRLI_SP_INITIATOR) req.port_role = FC_PORT_ROLE_NVME_INITIATOR; diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index d74c32f84ef5..c686522ff64e 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -338,6 +338,11 @@ static void qla2x00_free_device(scsi_qla_host_t *); static int qla2xxx_map_queues(struct Scsi_Host *shost); static void qla2x00_destroy_deferred_work(struct qla_hw_data *); +int ql2xdev_loss_tmo = 60; +module_param(ql2xdev_loss_tmo, int, 0444); +MODULE_PARM_DESC(ql2xdev_loss_tmo, + "Time to wait for device to recover before reporting\n" + "an error. Default is 60 seconds\n"); static struct scsi_transport_template *qla2xxx_transport_template = NULL; struct scsi_transport_template *qla2xxx_transport_vport_template = NULL; -- 2.29.2