Re: [PATCH v4 2/2] ufs: core: requeue aborted request

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2024-09-18 at 11:29 -0700, Bart Van Assche wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  On 9/18/24 6:29 AM, Peter Wang (王信友) wrote:
> > Basically, this patch currently only needs to handle requeueing
> > for the error handler abort.
> > The approach for DBR mode and MCQ mode should be consistent.
> > If receive an interrupt response (OCS:ABORTED or
> INVALID_OCS_VALUE),
> > then set DID_REQUEUE. If there is no interrupt, it will also set
> > SCSI DID_REQUEUE in ufshcd_err_handler through
> > ufshcd_complete_requests
> > with force_compl = true.
> 
> Reporting a completion for commands cleared by writing into the
> legacy
> UTRLCLR register is not compliant with any version of the UFSHCI
> standard. Reporting a completion for commands cleared by writing into
> that register is problematic because it causes
> ufshcd_release_scsi_cmd()
> to be called as follows:
> 
> ufshcd_sl_intr()
>    ufshcd_transfer_req_compl()
>      ufshcd_poll()
>        __ufshcd_transfer_req_compl()
>          ufshcd_compl_one_cqe()
>            cmd->result = ...
>            ufshcd_release_scsi_cmd()
>            scsi_done()
> 
> Calling ufshcd_release_scsi_cmd() if a command has been cleared is
> problematic because the SCSI core does not expect this. If 
> ufshcd_try_to_abort_task() clears a SCSI command, 
> ufshcd_release_scsi_cmd() must not be called until the SCSI core
> decides to release the command. This is why I wrote in a previous
> mail
> that I think that a quirk should be introduced to suppress the
> completions generated by clearing a SCSI command.
> 

Hi Bart,

I'm not sure if I'm misunderstanding your point, but I feel that
ufshcd_release_scsi_cmd should always be called. It's scsi_done 
that shouldn't be called, as it should be left to the SCSI layer 
to decide how to handle this command. 
Because ufshcd_release_scsi_cmd is just about releasing resources 
related to ufshcd_map_sg and the clock at the UFS driver level. 
scsi_done is what notifies the SCSI layer that the cmd has finished, 
asking it to look at the result to decide how to proceed.


> > The more problematic part is with MCQ mode. To imitate the DBR
> > approach, we just need to set DID_REQUEUE upon receiving an
> interrupt.
> > Everything else remains the same. This would make things simpler.
> > 
> > Moving forward, if we want to simplify things and we have also
> > taken stock of the two or three scenarios where OCS: ABORTED
> occurs,
> > do we even need a flag? Couldn't we just set DID_REQUEUE directly
> > for OCS: ABORTED?
> > What do you think?
> 
> How about making ufshcd_compl_one_cqe() skip entries with status
> OCS_ABORTED? That would make ufshcd_compl_one_cqe() behave as the
> SCSI core expects, namely not freeing any command resources if a
> SCSI command is aborted successfully.
> 
> This approach may require further changes to ufshcd_abort_all().
> In that function there are separate code paths for legacy and MCQ
> mode. This is less than ideal. Would it be possible to combine
> these code paths by removing the ufshcd_complete_requests() call
> from ufshcd_abort_all() and by handling completions from inside
> ufshcd_abort_one()?
> 
> Thanks,
> 
> Bart.

The four case flows for abort are as follows:
----------------------------------------------------------------

Case1: DBR ufshcd_abort

In this case, you can see that ufshcd_release_scsi_cmd will 
definitely be called.

ufshcd_abort()
  ufshcd_try_to_abort_task()		// It should trigger an
interrupt, but the tensor might not
  get outstanding_lock
  clear outstanding_reqs tag
  ufshcd_release_scsi_cmd()
  release outstanding_lock

ufshcd_intr()
  ufshcd_sl_intr()
    ufshcd_transfer_req_compl()
      ufshcd_poll()
        get outstanding_lock
        clear outstanding_reqs tag
        release outstanding_lock			
        __ufshcd_transfer_req_compl()
          ufshcd_compl_one_cqe()
          cmd->result = DID_REQUEUE	// mediatek may need quirk
change DID_ABORT to DID_REQUEUE
          ufshcd_release_scsi_cmd()
          scsi_done();

In most cases, ufshcd_intr will not reach scsi_done because the 
outstanding_reqs tag is cleared by the original thread. 
Therefore, whether there is an interrupt or not doesn't affect 
the result because the ISR will do nothing in most cases. 

In a very low chance, the ISR will reach scsi_done and notify 
SCSI to requeue, and the original thread will not 
call ufshcd_release_scsi_cmd.
MediaTek may need to change DID_ABORT to DID_REQUEUE in this 
situation, or perhaps not handle this ISR at all.

----------------------------------------------------------------

Case2: MCQ ufshcd_abort

In the case of MCQ ufshcd_abort, you can also see that 
ufshcd_release_scsi_cmd will definitely be called too. 
However, there seems to be a problem here, as 
ufshcd_release_scsi_cmd might be called twice. 
This is because cmd is not null in ufshcd_release_scsi_cmd, 
which the previous version would set cmd to null. 
Skipping OCS: ABORTED in ufshcd_compl_one_cqe indeed 
can avoid this problem. This part needs further 
consideration on how to handle it.

ufshcd_abort()
  ufshcd_mcq_abort()
    ufshcd_try_to_abort_task()	// will trigger ISR 
    ufshcd_release_scsi_cmd()

ufs_mtk_mcq_intr()
  ufshcd_mcq_poll_cqe_lock()
    ufshcd_mcq_process_cqe()
      ufshcd_compl_one_cqe()
        cmd->result = DID_ABORT
        ufshcd_release_scsi_cmd() // will release twice
        scsi_done()

----------------------------------------------------------------

Case3: DBR ufshcd_err_handler

In the case of the DBR mode error handler, it's the same; 
ufshcd_release_scsi_cmd will also be executed, and scsi_done 
will definitely be used to notify SCSI to requeue.

ufshcd_err_handler()
  ufshcd_abort_all()
    ufshcd_abort_one()
      ufshcd_try_to_abort_task()	// It should trigger an
interrupt, but the tensor might not
    ufshcd_complete_requests()
      ufshcd_transfer_req_compl()
        ufshcd_poll()
          get outstanding_lock
          clear outstanding_reqs tag
          release outstanding_lock	
          __ufshcd_transfer_req_compl()
            ufshcd_compl_one_cqe()
              cmd->result = DID_REQUEUE // mediatek may need quirk
change DID_ABORT to DID_REQUEUE
              ufshcd_release_scsi_cmd()
              scsi_done()

ufshcd_intr()
  ufshcd_sl_intr()
    ufshcd_transfer_req_compl()
      ufshcd_poll()
        get outstanding_lock
        clear outstanding_reqs tag
        release outstanding_lock			
        __ufshcd_transfer_req_compl()
          ufshcd_compl_one_cqe()
          cmd->result = DID_REQUEUE // mediatek may need quirk change
DID_ABORT to DID_REQUEUE
          ufshcd_release_scsi_cmd()
          scsi_done();

At this time, the same actions are taken regardless of whether 
there is an ISR, and with the protection of outstanding_lock, 
only one thread will execute ufshcd_release_scsi_cmd and scsi_done.

----------------------------------------------------------------

Case4: MCQ ufshcd_err_handler

It's the same with MCQ mode; there is protection from the cqe lock, 
so only one thread will execute. What my patch 2 aims to do is to 
change DID_ABORT to DID_REQUEUE in this situation.

ufshcd_err_handler()
  ufshcd_abort_all()
    ufshcd_abort_one()
      ufshcd_try_to_abort_task()	// will trigger irq thread
    ufshcd_complete_requests()
      ufshcd_mcq_compl_pending_transfer()
        ufshcd_mcq_poll_cqe_lock()
          ufshcd_mcq_process_cqe()
            ufshcd_compl_one_cqe()
              cmd->result = DID_ABORT // should change to DID_REQUEUE
              ufshcd_release_scsi_cmd()
              scsi_done()

ufs_mtk_mcq_intr()
  ufshcd_mcq_poll_cqe_lock()
    ufshcd_mcq_process_cqe()
      ufshcd_compl_one_cqe()
        cmd->result = DID_ABORT  // should change to DID_REQUEUE
        ufshcd_release_scsi_cmd()
        scsi_done()

Thanks
Peter






[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux