> On 10/7/21 12:40 AM, Kiwoong Kim wrote: > > When an scsi command is dispatched right after host complete all the > > pending requests goes to idle and ufs driver tries to ring a doorbell, > > host might be still during entering into hibern8. If an error occurrs > > during that period, the doorbell might not be zero. In this case, > > clearing it should be needed to requeue its command. > > Currently, ufshcd_err_handler goes directly to reset when the driver's > > link state is broken. This patch is to make it clear doorbells in the > > case. In the situation, communication would be disabled, so TM isn't > > necessary or we can say it's not available. > > The above text is too hard to comprehend. Please make it more clear. As an > example, in the first sentence, what does "goes to idle" apply to? > Does it apply to "SCSI command", "pending requests" or something else? My 'goes to idle' means a state of no pended UTP requests. I write 'scsi command' because the symptom that I saw is related with scsi command. > > What mechanism does "If an error occurs" refer to? A SCSI command > processing error, a link error or another type of error? Hibern8 errors written in the title. > > > Here's an actual symptom that I've faced. At the time, tag #17 is > > still pended even after host reset. And then the block timer is > > expired. > > pended -> pending. Got it. > > > exynos-ufs 11100000.ufs: ufshcd_check_errors: Auto Hibern8 Enter > > failed - status: 0x00000040, upmcrs: 0x00000001 .. > > host_regs: 00000050: b8671000 00000008 00020000 00000000 .. > > exynos-ufs 11100000.ufs: ufshcd_abort: Device abort task at tag 17 > > The relationship between the example and the description above the example > is not clear. If a command is pending before the error handler starts, > aborting that command fails and the host is not reset then the command > will still be pending after the error handler has finished. > > > @@ -6138,7 +6139,12 @@ static void ufshcd_err_handler(struct Scsi_Host > *host) > > (hba->saved_uic_err & (UFSHCD_UIC_DL_NAC_RECEIVED_ERROR | > > UFSHCD_UIC_DL_TCx_REPLAY_ERROR)))) { > > needs_reset = true; > > - goto do_reset; > > + spin_lock_irqsave(hba->host->host_lock, flags); > > + if (!hba->ahit && ufshcd_is_link_broken(hba)) > > + link_broken_in_ah8 = true; > > + spin_unlock_irqrestore(hba->host->host_lock, flags); > > + if (!link_broken_in_ah8) > > + goto do_reset; > > } > > > > My understanding is that hba->ahit == 0 means that auto-hibernation is > disabled. Hence, the above code sets 'link_broken_in_ah8' only if auto- > hibernation is disabled. So what does the name of the variable > 'link_broken_in_ah8' mean? Mistake. And while I'm commenting, I get a better idea and will revise this patch. > > > @@ -6168,6 +6174,9 @@ static void ufshcd_err_handler(struct Scsi_Host > *host) > > } > > } > > > > + if (link_broken_in_ah8) > > + goto lock_skip_pending_xfer_clear; > > + > > /* Clear pending task management requests */ > > for_each_set_bit(tag, &hba->outstanding_tasks, hba->nutmrs) { > > if (ufshcd_clear_tm_cmd(hba, tag)) { > > Why is skipping the ufshcd_clear_tm_cmd() calls useful in this case? > > Thanks, > > Bart. >