On Mon, Aug 12, 2024 at 02:24:31PM +0800, Shawn Lin wrote: > 在 2024/8/12 12:10, Manivannan Sadhasivam 写道: > > On Mon, Aug 12, 2024 at 09:28:26AM +0800, Shawn Lin wrote: > > > JHi Mani, > > > > > > 在 2024/8/10 17:28, Manivannan Sadhasivam 写道: > > > > On Fri, Aug 09, 2024 at 04:16:41PM +0800, Shawn Lin wrote: > > > > > > > > [...] > > > > > > > > > > > +static int ufs_rockchip_hce_enable_notify(struct ufs_hba *hba, > > > > > > > + enum ufs_notify_change_status status) > > > > > > > +{ > > > > > > > + int err = 0; > > > > > > > + > > > > > > > + if (status == PRE_CHANGE) { > > > > > > > + int retry_outer = 3; > > > > > > > + int retry_inner; > > > > > > > +start: > > > > > > > + if (ufshcd_is_hba_active(hba)) > > > > > > > + /* change controller state to "reset state" */ > > > > > > > + ufshcd_hba_stop(hba); > > > > > > > + > > > > > > > + /* UniPro link is disabled at this point */ > > > > > > > + ufshcd_set_link_off(hba); > > > > > > > + > > > > > > > + /* start controller initialization sequence */ > > > > > > > + ufshcd_writel(hba, CONTROLLER_ENABLE, REG_CONTROLLER_ENABLE); > > > > > > > + > > > > > > > + usleep_range(100, 200); > > > > > > > + > > > > > > > + /* wait for the host controller to complete initialization */ > > > > > > > + retry_inner = 50; > > > > > > > + while (!ufshcd_is_hba_active(hba)) { > > > > > > > + if (retry_inner) { > > > > > > > + retry_inner--; > > > > > > > + } else { > > > > > > > + dev_err(hba->dev, > > > > > > > + "Controller enable failed\n"); > > > > > > > + if (retry_outer) { > > > > > > > + retry_outer--; > > > > > > > + goto start; > > > > > > > + } > > > > > > > + return -EIO; > > > > > > > + } > > > > > > > + usleep_range(1000, 1100); > > > > > > > + } > > > > > > > > > > > > You just duplicated ufshcd_hba_execute_hce() here. Why? This doesn't make sense. > > > > > > > > > > Since we set UFSHCI_QUIRK_BROKEN_HCE, and we also need to do someting > > > > > which is very similar to ufshcd_hba_execute_hce(), before calling > > > > > ufshcd_dme_reset(). Similar but not totally the same. I'll try to see if > > > > > we can export ufshcd_hba_execute_hce() to make full use of it. > > > > > > > > > > > > > But you are starting the controller using REG_CONTROLLER_ENABLE. Isn't that > > > > supposed to be broken if you set UFSHCI_QUIRK_BROKEN_HCE? Or I am > > > > misunderstanding the quirk? > > > > > > > > > > Our controller doesn't work with exiting code, whether setting > > > UFSHCI_QUIRK_BROKEN_HCE or not. > > > > > > > Okay. Then this means you do not need this quirk at all. > > > > > > > > For UFSHCI_QUIRK_BROKEN_HCE case, it calls ufshcd_dme_reset()first, > > > but we need to set REG_CONTROLLER_ENABLE first. > > > > > > For !UFSHCI_QUIRK_BROKEN_HCE case, namly ufshcd_hba_execute_hce, it > > > sets REG_CONTROLLER_ENABLE first but never send DMA_RESET and calls > > > ufshcd_dme_enable. > > > > > > > I don't see where ufshcd_dme_enable() is getting called for > > !UFSHCI_QUIRK_BROKEN_HCE case. > > > > > So the closet code path is to go through UFSHCI_QUIRK_BROKEN_HCE case, > > > and set REG_CONTROLLER_ENABLE by adding hce_enable_notify hook. > > > > > > > No, that is abusing the quirk. But I'm confused about why your controller wants > > resetting the unipro stack _after_ enabling the controller? Why can't it be > > reset before? > > > > It can't be. The DME_RESET to reset the unipro stack will be failed > without enabling REG_CONTROLLER_ENABLE. And the controller does want us > to reset the unipro stack before other coming UICs. > > So I considered it's a kind of broken HCE case as well. Should I add a > new quirk or add a new hba_enable hook in ufs_hba_variant_ops? Or just > use UFSHCI_QUIRK_BROKEN_HCE ? > IMO, you should add a new quirk and use it directly in ufshcd_hba_execute_hce(). But you need to pick the quirk name as per the actual quirky behavior of the controller. > > > > > > > > > > > > > + } else { /* POST_CHANGE */ > > > > > > > + err = ufshcd_vops_phy_initialization(hba); > > > > > > > + } > > > > > > > + > > > > > > > + return err; > > > > > > > +} > > > > > > > + > > > > [...] > > > > > > > > > +static const struct dev_pm_ops ufs_rockchip_pm_ops = { > > > > > > > + SET_SYSTEM_SLEEP_PM_OPS(ufs_rockchip_suspend, ufs_rockchip_resume) > > > > > > > + SET_RUNTIME_PM_OPS(ufs_rockchip_runtime_suspend, ufs_rockchip_runtime_resume, NULL) > > > > > > > > > > > > Why can't you use ufshcd PM ops as like other vendor drivers? > > > > > > > > > > It doesn't work from the test. We have many use case to power down the > > > > > controller and device, so there is no flow to recovery the link. Only > > > > > when the first accessing to UFS fails, the ufshcd error handle recovery the > > > > > link. This is not what we expect. > > > > > > > > > > > > > What tests? The existing UFS controller drivers are used in production devices > > > > and they never had a usecase to invent their own PM callbacks. So if your > > > > controller is special, then you need to justify it more elaborately. If > > > > something is missing in ufshcd callbacks, then we can add them. > > > > > > > > > > All the register got lost each time as we power down both controller & PHY > > > and devices in suspend. > > > > Which suspend? runtime or system suspend? I believe system suspend. > > Both. > With {rpm/spm}_lvl = 3, you should not power down the controller. > > > > > So we have to restore the necessary > > > registers and link. I didn't see where the code recovery the controller > > > settings in ufshcd_resume, except ufshcd_err_handler()triggers that. > > > Am I missing any thing? > > > > Can you explain what is causing the powerdown of the controller and PHY? > > Because, ufshcd_suspend() just turns off the clocks and regulators (if > > UFSHCD_CAP_AGGR_POWER_COLLAPSE is set) and spm_lvl 3 set by this driver only > > puts the device in sleep mode and link in hibern8 state. > > > > For runtime PM case, it's the power-domain driver will power down the > controller and PHY if UFS stack is not active any more(autosuspend). > > For system PM case, it's the SoC's firmware to cutting of all the power > for controller/PHY and device. > Both cases are not matching the expectations of {rpm/spm}_lvl. So the platform (power domain or the firmware) should be fixed. What if the user sets the {rpm/spm}_lvl to 1? Will the platform power down the controller even then? If so, then I'd say that the platform is broken and should be fixed. - Mani -- மணிவண்ணன் சதாசிவம்