On Thu, 2024-09-05 at 14:16 -0700, Bart Van Assche wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > On 9/1/24 7:18 PM, peter.wang@xxxxxxxxxxxx wrote: > > diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs- > mcq.c > > index afd9541f4bd8..abdc55a8b960 100644 > > --- a/drivers/ufs/core/ufs-mcq.c > > +++ b/drivers/ufs/core/ufs-mcq.c > > @@ -642,6 +642,7 @@ static bool ufshcd_mcq_sqe_search(struct > ufs_hba *hba, > > match = le64_to_cpu(utrd->command_desc_base_addr) & CQE_UCD_BA; > > if (addr == match) { > > ufshcd_mcq_nullify_sqe(utrd); > > +lrbp->host_initiate_abort = true; > > ret = true; > > goto out; > > } > > I think this is wrong. The above code is only executed if the SCSI > core > decides to abort a SCSI command. It is up to the SCSI core to decide > whether or not to retry an aborted command. > Hi Bart, This is eh_abort_handler call flow for scsi err handler. If abort is trigger because error, should't we do retry? Anyway, I think this case could not happen because if scsi timeout happen (30s), host hw should not keep cmd in SQ such a long time. But once it happen, ufshcd_mcq_sqe_search return true and scsi got eh_abort_handler fail. So, I think in this case, notify scsi retry this command is properly. > > -/* Release cmd in MCQ mode if abort succeeds */ > > -if (hba->mcq_enabled && (*ret == 0)) { > > -hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(lrbp->cmd)); > > -if (!hwq) > > -return 0; > > -spin_lock_irqsave(&hwq->cq_lock, flags); > > -if (ufshcd_cmd_inflight(lrbp->cmd)) > > -ufshcd_release_scsi_cmd(hba, lrbp); > > -spin_unlock_irqrestore(&hwq->cq_lock, flags); > > -} > > +/* Host will post to CQ with OCS_ABORTED after SQ cleanup */ > > +if (hba->mcq_enabled && (*ret == 0)) > > +lrbp->host_initiate_abort = true; > > I think this code is racy because the UFS host controller may have > posted a completion before the "lrbp->host_initiate_abort = true" > assignment is executed. > Yes, I should add this code before ufshcd_clear_cmd, thanks. > > + * @host_initiate_abort: Abort flag initiated by host > > What is "Abort flag"? Please consider renaming "host_initiate_abort" > into "abort_initiated_by_err_handler" since I think that aborted > commands should only be retried if these have been aborted by > ufshcd_err_handler(). > Okay, but abort_initiated_by_err maybe better because aborted by ufshcd_err_handler or scsi err handler could happen. What do you think? > Thanks, > > Bart.