Re: [PATCH v4 2/2] ufs: core: requeue aborted request

On 9/19/24 5:16 AM, Peter Wang (王信友) wrote:
The four case flows for abort are as follows:
----------------------------------------------------------------

Case1: DBR ufshcd_abort

Please follow the terminology from the UFSHCI 4.0 standard and use the
word "legacy" instead of "DBR".

In this case, you can see that ufshcd_release_scsi_cmd will
definitely be called.

ufshcd_abort()
   ufshcd_try_to_abort_task()	// It should trigger an interrupt, but the tensor might not
   get outstanding_lock
   clear outstanding_reqs tag
   ufshcd_release_scsi_cmd()
   release outstanding_lock

ufshcd_intr()
   ufshcd_sl_intr()
     ufshcd_transfer_req_compl()
       ufshcd_poll()
         get outstanding_lock
         clear outstanding_reqs tag
         release outstanding_lock			
         __ufshcd_transfer_req_compl()
           ufshcd_compl_one_cqe()
           cmd->result = DID_REQUEUE	// MediaTek may need a quirk to change DID_ABORT to DID_REQUEUE
           ufshcd_release_scsi_cmd()
           scsi_done();

In most cases, ufshcd_intr will not reach scsi_done because the
outstanding_reqs tag is cleared by the original thread.
Therefore, whether there is an interrupt or not doesn't affect
the result because the ISR will do nothing in most cases.

In rare cases, the ISR will reach scsi_done and notify SCSI
to requeue, and the original thread will then not call
ufshcd_release_scsi_cmd.
MediaTek may need to change DID_ABORT to DID_REQUEUE in this
situation, or perhaps not handle this ISR at all.

Please modify ufshcd_compl_one_cqe() such that it ignores commands
with status OCS_ABORTED. This will make the UFSHCI driver behave in
the same way for all UFSHCI controllers, whether or not clearing a
command triggers a completion interrupt.
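
To make the proposal concrete, here is a minimal user-space sketch of a completion handler that ignores aborted commands. It is not the kernel code: the model_* types and functions are hypothetical stand-ins, and clearing lrbp->cmd on completion is an assumption made so that the model can detect a repeated completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum model_ocs { MODEL_OCS_SUCCESS, MODEL_OCS_ABORTED };

struct model_cmd {
	int released;	/* how many times resources were released */
	int done;	/* how many times the SCSI layer was notified */
};

struct model_lrb {
	struct model_cmd *cmd;
	enum model_ocs ocs;
};

/*
 * Models the completion path. Returns true if this call completed the
 * command, false if it left the command to the abort path.
 */
static bool model_compl_one_cqe(struct model_lrb *lrbp)
{
	struct model_cmd *cmd = lrbp->cmd;

	if (!cmd)
		return false;
	/* Proposed rule: an aborted command is owned by the abort path. */
	if (lrbp->ocs == MODEL_OCS_ABORTED)
		return false;

	cmd->released++;	/* stands in for ufshcd_release_scsi_cmd() */
	cmd->done++;		/* stands in for scsi_done() */
	lrbp->cmd = NULL;	/* assumption: the slot is marked free here */
	return true;
}
```

With this rule the completion path behaves the same whether or not the controller raises an interrupt for a cleared command, because the aborted command is never released or completed from there.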

----------------------------------------------------------------

Case2: MCQ ufshcd_abort

In the case of MCQ ufshcd_abort, you can see that
ufshcd_release_scsi_cmd will definitely be called as well.
However, there seems to be a problem here:
ufshcd_release_scsi_cmd might be called twice.
This is because ufshcd_release_scsi_cmd no longer sets cmd to
NULL, as the previous version did, so cmd is still non-NULL on
the second call.
Skipping OCS: ABORTED in ufshcd_compl_one_cqe can indeed
avoid this problem. How to handle this part needs further
consideration.

ufshcd_abort()
   ufshcd_mcq_abort()
     ufshcd_try_to_abort_task()	// will trigger ISR
     ufshcd_release_scsi_cmd()

ufs_mtk_mcq_intr()
   ufshcd_mcq_poll_cqe_lock()
     ufshcd_mcq_process_cqe()
       ufshcd_compl_one_cqe()
         cmd->result = DID_ABORT
         ufshcd_release_scsi_cmd() // will release twice
         scsi_done()

Do you agree that this case can be addressed with the
ufshcd_compl_one_cqe() change proposed above?
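
The double release in this case can be counted with a small model. This is only a sketch under stated assumptions: the case2_* names are hypothetical, the abort is assumed to succeed, and the interrupt is assumed to deliver exactly one CQE with OCS: ABORTED for the aborted command.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the Case2 sequence; names are illustrative only. */
struct case2_state {
	int release_calls;	/* stands in for ufshcd_release_scsi_cmd() calls */
	int done_calls;		/* stands in for scsi_done() calls */
};

/* Abort path: after a successful abort it releases the command itself. */
static void case2_mcq_abort(struct case2_state *s)
{
	s->release_calls++;
}

/* Interrupt path for the CQE generated by the abort (OCS: ABORTED). */
static void case2_isr_aborted_cqe(struct case2_state *s, bool skip_aborted)
{
	if (skip_aborted)
		return;		/* proposed behavior: ignore the aborted CQE */
	s->release_calls++;
	s->done_calls++;
}

/* Runs the abort followed by the ISR and returns the number of releases. */
static int case2_releases(bool skip_aborted)
{
	struct case2_state s = { 0, 0 };

	case2_mcq_abort(&s);
	case2_isr_aborted_cqe(&s, skip_aborted);
	return s.release_calls;
}
```

In this model case2_releases(false) counts two releases, which is the problem described above, while case2_releases(true) counts one.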

----------------------------------------------------------------

Case3: DBR ufshcd_err_handler

In the case of the DBR mode error handler, it's the same;
ufshcd_release_scsi_cmd will also be executed, and scsi_done
will definitely be used to notify SCSI to requeue.

ufshcd_err_handler()
   ufshcd_abort_all()
     ufshcd_abort_one()
       ufshcd_try_to_abort_task()	// It should trigger an interrupt, but the tensor might not
     ufshcd_complete_requests()
       ufshcd_transfer_req_compl()
         ufshcd_poll()
           get outstanding_lock
           clear outstanding_reqs tag
           release outstanding_lock	
           __ufshcd_transfer_req_compl()
             ufshcd_compl_one_cqe()
               cmd->result = DID_REQUEUE // MediaTek may need a quirk to change DID_ABORT to DID_REQUEUE
               ufshcd_release_scsi_cmd()
               scsi_done()

ufshcd_intr()
   ufshcd_sl_intr()
     ufshcd_transfer_req_compl()
       ufshcd_poll()
         get outstanding_lock
         clear outstanding_reqs tag
         release outstanding_lock			
         __ufshcd_transfer_req_compl()
           ufshcd_compl_one_cqe()
           cmd->result = DID_REQUEUE // MediaTek may need a quirk to change DID_ABORT to DID_REQUEUE
           ufshcd_release_scsi_cmd()
           scsi_done();

In this case, the same actions are taken whether or not the ISR
runs, and with the protection of outstanding_lock, only one
thread will execute ufshcd_release_scsi_cmd and scsi_done.
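
The "only one thread completes" argument comes down to a test-and-clear of the tag bit. The sketch below models that with hypothetical tag_* helpers; it is single-threaded, so the lock that would guard the bitmap in the driver is left out and only mentioned in the comments.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: one bit per tag, as with outstanding_reqs. In the
 * driver this test-and-clear runs under outstanding_lock; the lock is
 * omitted because this sketch is single-threaded.
 */
static bool tag_claim(unsigned long *outstanding, unsigned int tag)
{
	unsigned long mask = 1UL << tag;
	bool was_set = (*outstanding & mask) != 0;

	*outstanding &= ~mask;
	return was_set;
}

/* Completes a tag only if this caller claimed it; returns completions done. */
static int tag_complete_if_owner(unsigned long *outstanding, unsigned int tag)
{
	if (!tag_claim(outstanding, tag))
		return 0;	/* another path already completed this tag */
	return 1;		/* release and scsi_done() would happen here */
}
```

Calling tag_complete_if_owner() twice for the same tag, once for the error handler and once for the ISR, yields a single completion in this model, whichever caller runs first.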
----------------------------------------------------------------

Case4: MCQ ufshcd_err_handler

It's the same with MCQ mode; there is protection from the cqe lock,
so only one thread will execute. What my patch 2 aims to do is to
change DID_ABORT to DID_REQUEUE in this situation.

ufshcd_err_handler()
   ufshcd_abort_all()
     ufshcd_abort_one()
       ufshcd_try_to_abort_task()	// will trigger irq thread
     ufshcd_complete_requests()
       ufshcd_mcq_compl_pending_transfer()
         ufshcd_mcq_poll_cqe_lock()
           ufshcd_mcq_process_cqe()
             ufshcd_compl_one_cqe()
               cmd->result = DID_ABORT // should change to DID_REQUEUE
               ufshcd_release_scsi_cmd()
               scsi_done()

ufs_mtk_mcq_intr()
   ufshcd_mcq_poll_cqe_lock()
     ufshcd_mcq_process_cqe()
       ufshcd_compl_one_cqe()
         cmd->result = DID_ABORT  // should change to DID_REQUEUE
         ufshcd_release_scsi_cmd()
         scsi_done()

For legacy and MCQ mode, I prefer the following behavior for
ufshcd_abort_all():
* ufshcd_compl_one_cqe() ignores commands with status OCS_ABORTED.
* ufshcd_release_scsi_cmd() is called either by ufshcd_abort_one() or
  by ufshcd_abort_all().

Do you agree with making the changes proposed above?

Thank you,

Bart.




