On Thu, 2024-09-19 at 11:49 -0700, Bart Van Assche wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > On 9/19/24 5:16 AM, Peter Wang (王信友) wrote: > > The four case flows for abort are as follows: > > ---------------------------------------------------------------- > > > > Case1: DBR ufshcd_abort > > Please follow the terminology from the UFSHCI 4.0 standard and use > the > word "legacy" instead of "DBR". > Hi Bart, Okay, but the current code comments all use 'SDB mode'. Should we just stick with that term? > > In this case, you can see that ufshcd_release_scsi_cmd will > > definitely be called. > > > > ufshcd_abort() > > ufshcd_try_to_abort_task()// It should trigger an > > interrupt, but the tensor might not > > get outstanding_lock > > clear outstanding_reqs tag > > ufshcd_release_scsi_cmd() > > release outstanding_lock > > > > ufshcd_intr() > > ufshcd_sl_intr() > > ufshcd_transfer_req_compl() > > ufshcd_poll() > > get outstanding_lock > > clear outstanding_reqs tag > > release outstanding_lock > > __ufshcd_transfer_req_compl() > > ufshcd_compl_one_cqe() > > cmd->result = DID_REQUEUE// mediatek may need quirk > > change DID_ABORT to DID_REQUEUE > > ufshcd_release_scsi_cmd() > > scsi_done(); > > > > In most cases, ufshcd_intr will not reach scsi_done because the > > outstanding_reqs tag is cleared by the original thread. > > Therefore, whether there is an interrupt or not doesn't affect > > the result because the ISR will do nothing in most cases. > > > > In a very low chance, the ISR will reach scsi_done and notify > > SCSI to requeue, and the original thread will not > > call ufshcd_release_scsi_cmd. > > MediaTek may need to change DID_ABORT to DID_REQUEUE in this > > situation, or perhaps not handle this ISR at all. > > Please modify ufshcd_compl_one_cqe() such that it ignores commands > with status OCS_ABORTED. This will make the UFSHCI driver behave in > the same way for all UFSHCI controllers, whether or not clearing a > command triggers a completion interrupt. > Yes, I am considering how to modify the code here. > > ---------------------------------------------------------------- > > > > Case2: MCQ ufshcd_abort > > > > In the case of MCQ ufshcd_abort, you can also see that > > ufshcd_release_scsi_cmd will definitely be called too. > > However, there seems to be a problem here, as > > ufshcd_release_scsi_cmd might be called twice. > > This is because cmd is not null in ufshcd_release_scsi_cmd, > > which the previous version would set cmd to null. > > Skipping OCS: ABORTED in ufshcd_compl_one_cqe indeed > > can avoid this problem. This part needs further > > consideration on how to handle it. > > > > ufshcd_abort() > > ufshcd_mcq_abort() > > ufshcd_try_to_abort_task()// will trigger ISR > > ufshcd_release_scsi_cmd() > > > > ufs_mtk_mcq_intr() > > ufshcd_mcq_poll_cqe_lock() > > ufshcd_mcq_process_cqe() > > ufshcd_compl_one_cqe() > > cmd->result = DID_ABORT > > ufshcd_release_scsi_cmd() // will release twice > > scsi_done() > > Do you agree that this case can be addressed with the > ufshcd_compl_one_cqe() change proposed above? > Agree. > > ---------------------------------------------------------------- > > > > Case3: DBR ufshcd_err_handler > > > > In the case of the DBR mode error handler, it's the same; > > ufshcd_release_scsi_cmd will also be executed, and scsi_done > > will definitely be used to notify SCSI to requeue. > > > > ufshcd_err_handler() > > ufshcd_abort_all() > > ufshcd_abort_one() > > ufshcd_try_to_abort_task()// It should trigger an > > interrupt, but the tensor might not > > ufshcd_complete_requests() > > ufshcd_transfer_req_compl() > > ufshcd_poll() > > get outstanding_lock > > clear outstanding_reqs tag > > release outstanding_lock > > __ufshcd_transfer_req_compl() > > ufshcd_compl_one_cqe() > > cmd->result = DID_REQUEUE // mediatek may need quirk > > change DID_ABORT to DID_REQUEUE > > ufshcd_release_scsi_cmd() > > scsi_done() > > > > ufshcd_intr() > > ufshcd_sl_intr() > > ufshcd_transfer_req_compl() > > ufshcd_poll() > > get outstanding_lock > > clear outstanding_reqs tag > > release outstanding_lock > > __ufshcd_transfer_req_compl() > > ufshcd_compl_one_cqe() > > cmd->result = DID_REQUEUE // mediatek may need quirk > change > > DID_ABORT to DID_REQUEUE > > ufshcd_release_scsi_cmd() > > scsi_done(); > > > > At this time, the same actions are taken regardless of whether > > there is an ISR, and with the protection of outstanding_lock, > > only one thread will execute ufshcd_release_scsi_cmd and scsi_done. > > ---------------------------------------------------------------- > > > > Case4: MCQ ufshcd_err_handler > > > > It's the same with MCQ mode; there is protection from the cqe lock, > > so only one thread will execute. What my patch 2 aims to do is to > > change DID_ABORT to DID_REQUEUE in this situation. > > > > ufshcd_err_handler() > > ufshcd_abort_all() > > ufshcd_abort_one() > > ufshcd_try_to_abort_task()// will trigger irq thread > > ufshcd_complete_requests() > > ufshcd_mcq_compl_pending_transfer() > > ufshcd_mcq_poll_cqe_lock() > > ufshcd_mcq_process_cqe() > > ufshcd_compl_one_cqe() > > cmd->result = DID_ABORT // should change to > DID_REQUEUE > > ufshcd_release_scsi_cmd() > > scsi_done() > > > > ufs_mtk_mcq_intr() > > ufshcd_mcq_poll_cqe_lock() > > ufshcd_mcq_process_cqe() > > ufshcd_compl_one_cqe() > > cmd->result = DID_ABORT // should change to DID_REQUEUE > > ufshcd_release_scsi_cmd() > > scsi_done() > > For legacy and MCQ mode, I prefer the following behavior for > ufshcd_abort_all(): > * ufshcd_compl_one_cqe() ignores commands with status OCS_ABORTED. > * ufshcd_release_scsi_cmd() is called either by ufshcd_abort_one() or > by ufshcd_abort_all(). > > Do you agree with making the changes proposed above? > > Thank you, This might not work, as SDB mode doesn't ignore OCS: INVALID_OCS_VALUE but rather notifies SCSI to requeue. So what we need to correct is to notify SCSI to requeue when MCQ mode receives OCS: ABORTED as well. Furthermore, ufshcd_compl_one_cqe, whether it comes from ufshcd_abort_all or ISR, does the same thing and is protected by a lock. Therefore, there is no need for special handling specifically for ufshcd_abort_all. After discussing with you, I realized that there are indeed many deficiencies and inconsistencies here that need to be addressed. I will submit a new patch for the content discussed above. Thanks. Peter > > Bar