Re: [PATCH v3] scsi: ufs: Cleanup completed request without interrupt notification

Stanley Chu <stanley.chu@xxxxxxxxxxxx> · Wed, 22 Jul 2020 18:07:37 +0800

Hi Bart, Avri,

On Tue, 2020-07-14 at 21:00 -0700, Bart Van Assche wrote:
> On 2020-07-13 01:10, Avri Altman wrote:
> > Artificially injecting errors is a very common validation mechanism,
> > Provided that you are not breaking anything of the upper-layers,
> > Which I don't think you are doing.
> 

As the concerns of below questions,

"scsi timeout is 30sec - do you expect an interrupt to arrive after
that?"

Actually in my test scenario, the flow works well without re-checking
"outstanding_reqs" in "cleanup" section in ufshcd_abort(), so I would
remove this checking first and resend this fix (with refined commit
message according to blk-mq, not legacy blk). Please let me know if you
have any suggestions.

> Hi Avri,
> 
> My concern is that the code that is being added in the abort handler
> sooner or later will evolve into a duplicate of the regular completion
> path. Wouldn't it be better to poll for completions from the timeout
> handler by calling ufshcd_transfer_req_compl() instead of duplicating
> that function?
> 

The duplicated calls of cleanup job would be as below,

scsi_dma_unmap(cmd);
hba->lrb[tag].cmd = NULL;
ufshcd_outstanding_req_clear(hba, tag);

As your suggestions, above calls could be re-factored but the third call
in __ufshcd_transfer_req_compl() would be more efficient by

hba->outstanding_reqs ^= completed_reqs;

for all handled requests in interrupt handler.

Here we could not directly use "ufshcd_transfer_req_compl()" or its
inner function "__ufshcd_transfer_req_compl()" since at least
scsi_done() is not required in ufshcd_abort() because the completion
flow will be handled by SCSI error handler, not ufshcd_abort() itself.

> >>> In section 7.2.3 of the UFS specification I found the following about how
> >>> to process request completions: "Software determines if new TRs have
> >>> completed since step #2, by repeating one of the two methods described in
> >>> step #2. If new TRs have completed, software repeats the sequence from
> >>> step #3." Is such a loop perhaps missing from the Linux UFS driver?
> >
> > Could not find that citation.
> > What version of the spec are you using?
> 
> That quote comes from the following document: "Universal Flash Storage
> Host Controller Interface (UFSHCI); Version 2.1; JESD223C; (Revision of
> JESD223B, September 2013); MARCH 2016".

Above description has already be implemented in ufshcd_intr() and
ufshcd_transfer_req_compl(). But this loop cannot save "missing
interrupt" just like this injected error case.

Thanks,
Stanley Chu