Sorry for the non plain text format.

On Mon, Apr 8, 2024 at 8:30 PM John Garry <john.g.garry@xxxxxxxxxx> wrote:
>
> On 08/04/2024 11:05, Lei Chen wrote:
> > When an scmd times out, block layer calls megasas_reset_timer to
> > make further decisions. scmd_timeout indicates when an scmd is really
> > timed-out.
>
> What does really timed-out mean?

scsi_times_out will call eh_timed_out (in the megaraid driver, this is
megasas_reset_timer), and megasas_reset_timer determines whether the scmd
has timed out. If it has not, it returns BLK_EH_RESET_TIMER to tell the
block layer to reset the timer and do nothing else.

> >
> > If we want to make this process more fast, we can decrease
> > this value. This patch allows users to change this value in run-time.
> >
> > Signed-off-by: Lei Chen <lei.chen@xxxxxxxxxx>
> > ---
> >  drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> > index 3d4f13da1ae8..2a165e5dc7a3 100644
> > --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> > @@ -91,7 +91,7 @@ module_param(dual_qdepth_disable, int, 0444);
> >  MODULE_PARM_DESC(dual_qdepth_disable, "Disable dual queue depth feature. Default: 0");
> >
> >  static unsigned int scmd_timeout = MEGASAS_DEFAULT_CMD_TIMEOUT;
> > -module_param(scmd_timeout, int, 0444);
> > +module_param(scmd_timeout, int, 0644);
> >  MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s), default 90s. See megasas_reset_timer.");
> >
> >  int perf_mode = -1;
>
> I don't know why megaraid_sas has special handling here (and bypasses
> SCSI midlayer).
>
> If the host is overloaded and you get a time-out as a command simply
> could not be handled in time, can you alternatively try reducing the
> scsi device queue depth?

Yeah, the scsi layer and drivers already have some methods to control the queue depth.
For the megaraid driver, megasas_reset_timer throttles the queue depth. But since the scsi disks on the same megaraid card share that queue depth, throttling also affects the other scsi disks. In most cases a scsi disk is more likely to be malfunctioning than the RAID card, which makes scmds fail and get retried. We want to adjust scmd_timeout without reloading the driver so that scmds against abnormal scsi disks complete faster.
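
To make the decision described above concrete, here is a minimal user-space sketch of it. This is not the driver code: the enum, the function name model_reset_timer and the factor of two applied to scmd_timeout are assumptions for illustration only, so please check megasas_reset_timer for the real comparison.

```c
/* Hypothetical stand-ins for the return codes seen by the block layer. */
enum eh_action {
	EH_RESET_TIMER,	/* not really timed out yet: re-arm the timer */
	EH_NOT_HANDLED	/* really timed out: leave it to error handling */
};

/*
 * Simplified model of the timeout decision. All times are in seconds.
 * The command counts as "really timed out" once it has been outstanding
 * for longer than a deadline derived from scmd_timeout. The factor of 2
 * is an assumption made for this sketch. Callers must pass now_s >= alloc_s.
 */
static enum eh_action model_reset_timer(unsigned long now_s,
					unsigned long alloc_s,
					unsigned int scmd_timeout_s)
{
	unsigned long outstanding_s = now_s - alloc_s;

	if (outstanding_s > 2UL * scmd_timeout_s)
		return EH_NOT_HANDLED;

	return EH_RESET_TIMER;
}
```

Under these assumptions, a command that has been outstanding for 100s only gets its timer reset with scmd_timeout=90, but is handed to error handling with scmd_timeout=30. That is the effect we want to reach at run-time for abnormal disks.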