Hi Avri, On Mon, 2019-07-22 at 06:10 +0000, Avri Altman wrote: > > > > > > > > Hi, > > > > > > > > > > > Currently bits in hba->outstanding_tasks are cleared only after their > > > > corresponding task management commands are successfully done by > > > > __ufshcd_issue_tm_cmd(). > > > > > > > > If timeout happens in a task management command, its corresponding > > > > bit in hba->outstanding_tasks will not be cleared until next task > > > > management command with the same tag used successfully finishes.‧ > > > ufshcd_clear_tm_cmd is also called as part of ufshcd_err_handler. > > > Does this change something in your assumptions? > > And BTW there is a specific __clear_bit in __ufshcd_issue_tm_cmd() in case > > of a TO. > > Gave it another look - > If indeed this bit isn't cleared as part of the error flow that the timeout triggers, > I think you should relate to ufshcd_clear_tm_cmd specifically in your commit log - > Because this is the obvious place where the bit cleanup should take place. > > Also the fix should be much more intuitive IMO - > Today we do __clear_bit() on success, ufshcd_clear_tm_cmd() on error, > And also ufshcd_put_tm_slot() either way? > > Maybe you can choose a single place to clear it, without any additional code? ufshcd_clear_tm_cmd() is similar to ufshcd_clear_cmd() which tries to write timed-out bit in "clear register". They do not clean bits in both outstanding masks. Another reason to not to do outstanding bits cleanup in ufshcd_clear_tm_cmd() is: if host is unable to clear TM command by setting "clear register", the TM command may be still "outstanding" in the device. In this case, it may be better to do cleanup after reset is done. Cleanup includes bits in hba->outstanding_tasks and hba->tm_slots_in_use which is possibly cleaned too early by ufshcd_put_tm_slot() if command is timed-out now. Referring to error handling flow of hba->outstanding_reqs, all timed-out bits will be cleaned by ufshcd_reset_and_restore() => ufshcd_transfer_req_compl() after reset is done. Similar handling for hba->outstanding_tasks could be applied, i.e., do cleanup by ufshcd_reset_and_restore() => ufshcd_tmc_handler(). The next thing is what you suggested: How to make the fix more intuitive. Patchset v2 will be uploaded for review: It tries to "re-factor" cleanup jobs first, and then add fixed flow to make the whole patch more readable. One more thing, above description and flow is done through UFSHCD SCSI error handling routines registered with SCSI Midlayer. For TM command timeout happening in bsg path without error handling triggered by SCSI layer, we may need to consider to handle those tasks in future patches. > > Thanks, > Avri Thanks, Stanley > _______________________________________________ > Linux-mediatek mailing list > Linux-mediatek@xxxxxxxxxxxxxxxxxxx > http://lists.infradead.org/mailman/listinfo/linux-mediatek