On Fri, 2023-09-15 at 22:21 +0200, Martin Wilck wrote: > On Tue, 2023-09-05 at 18:15 -0500, Mike Christie wrote: > > This has read_capacity_16 have scsi-ml retry errors instead of > > driving > > them itself. > > > > There are 2 behavior changes with this patch: > > 1. There is one behavior change where we no longer retry when > > scsi_execute_cmd returns < 0, but we should be ok. We don't need to > > retry > > for failures like the queue being removed, and for the case where > > there > > are no tags/reqs since the block layer waits/retries for us. For > > possible > > memory allocation failures from blk_rq_map_kern we use GFP_NOIO, so > > retrying will probably not help. > > 2. For the specific UAs we checked for and retried, we would get > > READ_CAPACITY_RETRIES_ON_RESET retries plus whatever retries were > > left > > from the loop's retries. Each UA now gets > > READ_CAPACITY_RETRIES_ON_RESET > > reties, and the other errors (not including medium not present) get > > up > > to 3 retries. > > This is ok, but - just a thought - would it make sense to add a field > for maximum total retry count (summed over all failures)? That would > allow us to minimize behavior changes also in other cases. Could we perhaps use scmd->allowed for this purpose? I noted that several callers of scsi_execute_cmd() in your patch set set the allowed parameter, e.g. to sdkp->max_retries in 07/34. But allowed doesn't seem to be used at all in the passthrough case, so we might as well use it as an upper limit for the total number of retries.