Re: [PATCH v2 3/3] scsi: ufs: rockchip: init support for UFS

Shawn Lin <shawn.lin@xxxxxxxxxxxxxx> · Tue, 13 Aug 2024 11:52:28 +0800

Hi Mani,

在 2024/8/13 0:55, Manivannan Sadhasivam 写道:
On Mon, Aug 12, 2024 at 02:24:31PM +0800, Shawn Lin wrote:
在 2024/8/12 12:10, Manivannan Sadhasivam 写道:
On Mon, Aug 12, 2024 at 09:28:26AM +0800, Shawn Lin wrote:
JHi Mani,

在 2024/8/10 17:28, Manivannan Sadhasivam 写道:
On Fri, Aug 09, 2024 at 04:16:41PM +0800, Shawn Lin wrote:

[...]

+static int ufs_rockchip_hce_enable_notify(struct ufs_hba *hba,
+					 enum ufs_notify_change_status status)
+{
+	int err = 0;
+
+	if (status == PRE_CHANGE) {
+		int retry_outer = 3;
+		int retry_inner;
+start:
+		if (ufshcd_is_hba_active(hba))
+			/* change controller state to "reset state" */
+			ufshcd_hba_stop(hba);
+
+		/* UniPro link is disabled at this point */
+		ufshcd_set_link_off(hba);
+
+		/* start controller initialization sequence */
+		ufshcd_writel(hba, CONTROLLER_ENABLE, REG_CONTROLLER_ENABLE);
+
+		usleep_range(100, 200);
+
+		/* wait for the host controller to complete initialization */
+		retry_inner = 50;
+		while (!ufshcd_is_hba_active(hba)) {
+			if (retry_inner) {
+				retry_inner--;
+			} else {
+				dev_err(hba->dev,
+					"Controller enable failed\n");
+				if (retry_outer) {
+					retry_outer--;
+					goto start;
+				}
+				return -EIO;
+			}
+			usleep_range(1000, 1100);
+		}

You just duplicated ufshcd_hba_execute_hce() here. Why? This doesn't make sense.

Since we set UFSHCI_QUIRK_BROKEN_HCE, and we also need to do someting
which is very similar to ufshcd_hba_execute_hce(), before calling
ufshcd_dme_reset(). Similar but not totally the same. I'll try to see if
we can export ufshcd_hba_execute_hce() to make full use of it.

But you are starting the controller using REG_CONTROLLER_ENABLE. Isn't that
supposed to be broken if you set UFSHCI_QUIRK_BROKEN_HCE? Or I am
misunderstanding the quirk?

Our controller doesn't work with exiting code, whether setting
UFSHCI_QUIRK_BROKEN_HCE or not.

Okay. Then this means you do not need this quirk at all.

For UFSHCI_QUIRK_BROKEN_HCE case, it calls ufshcd_dme_reset（）first,
but we need to set REG_CONTROLLER_ENABLE first.

For !UFSHCI_QUIRK_BROKEN_HCE case, namly ufshcd_hba_execute_hce, it
sets REG_CONTROLLER_ENABLE  first but never send DMA_RESET and calls
ufshcd_dme_enable.

I don't see where ufshcd_dme_enable() is getting called for
!UFSHCI_QUIRK_BROKEN_HCE case.

So the closet code path is to go through UFSHCI_QUIRK_BROKEN_HCE case,
and set REG_CONTROLLER_ENABLE by adding hce_enable_notify hook.

No, that is abusing the quirk. But I'm confused about why your controller wants
resetting the unipro stack _after_ enabling the controller? Why can't it be
reset before?

It can't be. The DME_RESET to reset the unipro stack will be failed
without enabling REG_CONTROLLER_ENABLE. And the controller does want us
to reset the unipro stack before other coming UICs.

So I considered it's a kind of broken HCE case as well. Should I add a
new quirk or add a new hba_enable hook in ufs_hba_variant_ops? Or just
use UFSHCI_QUIRK_BROKEN_HCE ?

IMO, you should add a new quirk and use it directly in ufshcd_hba_execute_hce().
But you need to pick the quirk name as per the actual quirky behavior of the
controller.

Thanks, Main. I'll add a new quirk for
ufshcd_hba_execute_hce() as per the actual quirky behavour.

+	} else { /* POST_CHANGE */
+		err = ufshcd_vops_phy_initialization(hba);
+	}
+
+	return err;
+}
+

[...]

+static const struct dev_pm_ops ufs_rockchip_pm_ops = {
+	SET_SYSTEM_SLEEP_PM_OPS(ufs_rockchip_suspend, ufs_rockchip_resume)
+	SET_RUNTIME_PM_OPS(ufs_rockchip_runtime_suspend, ufs_rockchip_runtime_resume, NULL)

Why can't you use ufshcd PM ops as like other vendor drivers?

It doesn't work from the test. We have many use case to power down the
controller and device, so there is no flow to recovery the link. Only
when the first accessing to UFS fails, the ufshcd error handle recovery the
link. This is not what we expect.

What tests? The existing UFS controller drivers are used in production devices
and they never had a usecase to invent their own PM callbacks. So if your
controller is special, then you need to justify it more elaborately. If
something is missing in ufshcd callbacks, then we can add them.

All the register got lost each time as we power down both controller & PHY
and devices in suspend.

Which suspend? runtime or system suspend? I believe system suspend.

Both.

With {rpm/spm}_lvl = 3, you should not power down the controller.

So we have to restore the necessary
registers and link. I didn't see where the code recovery the controller
settings in ufshcd_resume, except ufshcd_err_handler（）triggers that.
Am I missing any thing?

Can you explain what is causing the powerdown of the controller and PHY?
Because, ufshcd_suspend() just turns off the clocks and regulators (if
UFSHCD_CAP_AGGR_POWER_COLLAPSE is set) and spm_lvl 3 set by this driver only
puts the device in sleep mode and link in hibern8 state.

For runtime PM case, it's the power-domain driver will power down the
controller and PHY if UFS stack is not active any more（autosuspend）.

For system PM case, it's the SoC's firmware to cutting of all the power
for controller/PHY and device.

Both cases are not matching the expectations of {rpm/spm}_lvl. So the platform
(power domain or the firmware) should be fixed. What if the user sets the
{rpm/spm}_lvl to 1? Will the platform power down the controller even then? If
so, then I'd say that the platform is broken and should be fixed.

Ok, it seems I need to set {rpm/spm}_lvl = 6 if I want platform to power
down the controller for ultra power-saving. But I still need to add my
own system PM callback in that case to recovery the link first. Do I
misunderstand it?

And for the user who sets the rpm/spm level via
ufs_sysfs_pm_lvl_store(), I think there is no way to block it currently,
except that we need to fix the power-domain driver and Firmware to
respect the level and choose correct policy.

So in summary for what the next step I should to:
(1) Set {rpm/spm}_lvl = 6 in host driver to reflect the expectation
(2) Add own PM callbacks to recovery the link to meet the expectation
(3) Fix the broken behaviour of PD or Firmware to respect the actual
desired pm level if user changes the pm level.

Does that sound feasible to you?

Thanks.

- Mani