Re: [PATCH v2 2/2] ufs: core: requeue MCQ abort request

Peter Wang (王信友) <peter.wang@xxxxxxxxxxxx> · Mon, 9 Sep 2024 03:40:47 +0000

On Thu, 2024-09-05 at 14:16 -0700, Bart Van Assche wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  On 9/1/24 7:18 PM, peter.wang@xxxxxxxxxxxx wrote:
> > diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-
> mcq.c
> > index afd9541f4bd8..abdc55a8b960 100644
> > --- a/drivers/ufs/core/ufs-mcq.c
> > +++ b/drivers/ufs/core/ufs-mcq.c
> > @@ -642,6 +642,7 @@ static bool ufshcd_mcq_sqe_search(struct
> ufs_hba *hba,
> >   match = le64_to_cpu(utrd->command_desc_base_addr) & CQE_UCD_BA;
> >   if (addr == match) {
> >   ufshcd_mcq_nullify_sqe(utrd);
> > +lrbp->host_initiate_abort = true;
> >   ret = true;
> >   goto out;
> >   }
> 
> I think this is wrong. The above code is only executed if the SCSI
> core
> decides to abort a SCSI command. It is up to the SCSI core to decide
> whether or not to retry an aborted command.
> 

Hi Bart,

This is eh_abort_handler call flow for scsi err handler.
If abort is trigger because error, should't we do retry?
Anyway, I think this case could not happen because if scsi
timeout happen (30s), host hw should not keep cmd in SQ such 
a long time. But once it happen, ufshcd_mcq_sqe_search return 
true and scsi got eh_abort_handler fail. So, I think in this 
case, notify scsi retry this command is properly.

> > -/* Release cmd in MCQ mode if abort succeeds */
> > -if (hba->mcq_enabled && (*ret == 0)) {
> > -hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(lrbp->cmd));
> > -if (!hwq)
> > -return 0;
> > -spin_lock_irqsave(&hwq->cq_lock, flags);
> > -if (ufshcd_cmd_inflight(lrbp->cmd))
> > -ufshcd_release_scsi_cmd(hba, lrbp);
> > -spin_unlock_irqrestore(&hwq->cq_lock, flags);
> > -}
> > +/* Host will post to CQ with OCS_ABORTED after SQ cleanup */
> > +if (hba->mcq_enabled && (*ret == 0))
> > +lrbp->host_initiate_abort = true;
> 
> I think this code is racy because the UFS host controller may have 
> posted a completion before the "lrbp->host_initiate_abort = true"
> assignment is executed.
> 

Yes, I should add this code before ufshcd_clear_cmd, thanks.

> > + * @host_initiate_abort: Abort flag initiated by host
> 
> What is "Abort flag"? Please consider renaming "host_initiate_abort"
> into "abort_initiated_by_err_handler" since I think that aborted
> commands should only be retried if these have been aborted by
> ufshcd_err_handler().
> 

Okay, but abort_initiated_by_err maybe better because aborted by
ufshcd_err_handler or scsi err handler could happen. 
What do you think?

> Thanks,
> 
> Bart.