Re: [PATCH v2 3/3] scsi: ufs: rockchip: init support for UFS

Shawn Lin <shawn.lin@xxxxxxxxxxxxxx> · Tue, 13 Aug 2024 15:22:45 +0800

在 2024/8/13 11:52, Shawn Lin 写道:
Hi Mani,

在 2024/8/13 0:55, Manivannan Sadhasivam 写道:
On Mon, Aug 12, 2024 at 02:24:31PM +0800, Shawn Lin wrote:
在 2024/8/12 12:10, Manivannan Sadhasivam 写道:
On Mon, Aug 12, 2024 at 09:28:26AM +0800, Shawn Lin wrote:
JHi Mani,

在 2024/8/10 17:28, Manivannan Sadhasivam 写道:
On Fri, Aug 09, 2024 at 04:16:41PM +0800, Shawn Lin wrote:

[...]

+static int ufs_rockchip_hce_enable_notify(struct ufs_hba *hba,
+                     enum ufs_notify_change_status status)
+{
+    int err = 0;
+
+    if (status == PRE_CHANGE) {
+        int retry_outer = 3;
+        int retry_inner;
+start:
+        if (ufshcd_is_hba_active(hba))
+            /* change controller state to "reset state" */
+            ufshcd_hba_stop(hba);
+
+        /* UniPro link is disabled at this point */
+        ufshcd_set_link_off(hba);
+
+        /* start controller initialization sequence */
+        ufshcd_writel(hba, CONTROLLER_ENABLE, 
REG_CONTROLLER_ENABLE);
+
+        usleep_range(100, 200);
+
+        /* wait for the host controller to complete 
initialization */
+        retry_inner = 50;
+        while (!ufshcd_is_hba_active(hba)) {
+            if (retry_inner) {
+                retry_inner--;
+            } else {
+                dev_err(hba->dev,
+                    "Controller enable failed\n");
+                if (retry_outer) {
+                    retry_outer--;
+                    goto start;
+                }
+                return -EIO;
+            }
+            usleep_range(1000, 1100);
+        }

You just duplicated ufshcd_hba_execute_hce() here. Why? This 
doesn't make sense.

Since we set UFSHCI_QUIRK_BROKEN_HCE, and we also need to do 
someting
which is very similar to ufshcd_hba_execute_hce(), before calling
ufshcd_dme_reset(). Similar but not totally the same. I'll try to 
see if
we can export ufshcd_hba_execute_hce() to make full use of it.

But you are starting the controller using REG_CONTROLLER_ENABLE. 
Isn't that
supposed to be broken if you set UFSHCI_QUIRK_BROKEN_HCE? Or I am
misunderstanding the quirk?

Our controller doesn't work with exiting code, whether setting
UFSHCI_QUIRK_BROKEN_HCE or not.

Okay. Then this means you do not need this quirk at all.

For UFSHCI_QUIRK_BROKEN_HCE case, it calls ufshcd_dme_reset（）first,
but we need to set REG_CONTROLLER_ENABLE first.

For !UFSHCI_QUIRK_BROKEN_HCE case, namly ufshcd_hba_execute_hce, it
sets REG_CONTROLLER_ENABLE  first but never send DMA_RESET and calls
ufshcd_dme_enable.

I don't see where ufshcd_dme_enable() is getting called for
!UFSHCI_QUIRK_BROKEN_HCE case.

So the closet code path is to go through UFSHCI_QUIRK_BROKEN_HCE case,
and set REG_CONTROLLER_ENABLE by adding hce_enable_notify hook.

No, that is abusing the quirk. But I'm confused about why your 
controller wants
resetting the unipro stack _after_ enabling the controller? Why 
can't it be
reset before?

It can't be. The DME_RESET to reset the unipro stack will be failed
without enabling REG_CONTROLLER_ENABLE. And the controller does want us
to reset the unipro stack before other coming UICs.

So I considered it's a kind of broken HCE case as well. Should I add a
new quirk or add a new hba_enable hook in ufs_hba_variant_ops? Or just
use UFSHCI_QUIRK_BROKEN_HCE ?

IMO, you should add a new quirk and use it directly in 
ufshcd_hba_execute_hce().
But you need to pick the quirk name as per the actual quirky behavior 
of the
controller.

Thanks, Main. I'll add a new quirk for
ufshcd_hba_execute_hce() as per the actual quirky behavour.

+    } else { /* POST_CHANGE */
+        err = ufshcd_vops_phy_initialization(hba);
+    }
+
+    return err;
+}
+

[...]

+static const struct dev_pm_ops ufs_rockchip_pm_ops = {
+    SET_SYSTEM_SLEEP_PM_OPS(ufs_rockchip_suspend, 
ufs_rockchip_resume)
+    SET_RUNTIME_PM_OPS(ufs_rockchip_runtime_suspend, 
ufs_rockchip_runtime_resume, NULL)

Why can't you use ufshcd PM ops as like other vendor drivers?

It doesn't work from the test. We have many use case to power 
down the
controller and device, so there is no flow to recovery the link. 
Only
when the first accessing to UFS fails, the ufshcd error handle 
recovery the
link. This is not what we expect.

What tests? The existing UFS controller drivers are used in 
production devices
and they never had a usecase to invent their own PM callbacks. So 
if your
controller is special, then you need to justify it more 
elaborately. If
something is missing in ufshcd callbacks, then we can add them.

All the register got lost each time as we power down both 
controller & PHY
and devices in suspend.

Which suspend? runtime or system suspend? I believe system suspend.

Both.

With {rpm/spm}_lvl = 3, you should not power down the controller.

So we have to restore the necessary
registers and link. I didn't see where the code recovery the 
controller
settings in ufshcd_resume, except ufshcd_err_handler（）triggers that.
Am I missing any thing?

Can you explain what is causing the powerdown of the controller and 
PHY?
Because, ufshcd_suspend() just turns off the clocks and regulators (if
UFSHCD_CAP_AGGR_POWER_COLLAPSE is set) and spm_lvl 3 set by this 
driver only
puts the device in sleep mode and link in hibern8 state.

For runtime PM case, it's the power-domain driver will power down the
controller and PHY if UFS stack is not active any more（autosuspend）.

For system PM case, it's the SoC's firmware to cutting of all the power
for controller/PHY and device.

Both cases are not matching the expectations of {rpm/spm}_lvl. So the 
platform
(power domain or the firmware) should be fixed. What if the user sets the
{rpm/spm}_lvl to 1? Will the platform power down the controller even 
then? If
so, then I'd say that the platform is broken and should be fixed.

Ok, it seems I need to set {rpm/spm}_lvl = 6 if I want platform to power
down the controller for ultra power-saving. But I still need to add my
own system PM callback in that case to recovery the link first. Do I
misunderstand it?

And for the user who sets the rpm/spm level via
ufs_sysfs_pm_lvl_store(), I think there is no way to block it currently,
except that we need to fix the power-domain driver and Firmware to
respect the level and choose correct policy.

So in summary for what the next step I should to:
(1) Set {rpm/spm}_lvl = 6 in host driver to reflect the expectation
(2) Add own PM callbacks to recovery the link to meet the expectation
(3) Fix the broken behaviour of PD or Firmware to respect the actual
desired pm level if user changes the pm level.

Sorry, I misunderstood your comment, so the action should be
(1) Set {rpm/spm}_lvl = 5 in host driver to reflect the expectation
(2) Use ufshcd_system_suspend/resume, but keep our own runtime PM
callbacks as we need a extra step to gate refclk.
(3) Fix the broken behaviour of PD or Firmware to respect the actual
desired pm level if user changes the pm level.

Does that sound feasible to you?

Thanks.

- Mani