Re: [PATCH] Revert "arm64: dts: qcom: sm8250: Add cpuidle states"

Ulf Hansson <ulf.hansson@xxxxxxxxxx> · Tue, 18 Oct 2022 16:31:53 +0200

On Tue, 18 Oct 2022 at 16:09, Amit Pundir <amit.pundir@xxxxxxxxxx> wrote:
>
> On Tue, 18 Oct 2022 at 16:00, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
> >
> > On Mon, 17 Oct 2022 at 22:17, Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
> > >
> > > On Mon, Oct 17, 2022 at 10:10:05PM +0530, Amit Pundir wrote:
> > > > This reverts commit 32bc936d732171d48c9c8f96c4fa25ac3ed7e1c7.
> > > >
> > > > This patch was part of a patch series to add APSS RSC to
> > > > Cluster power domain
> > > > https://patchwork.kernel.org/project/linux-pm/cover/1641749107-31979-1-git-send-email-quic_mkshah@xxxxxxxxxxx/
> > > > but the rest of the patches in this series got NACKed and didn't land.
> > > >
> > > > These cpuidle states made RB5 (sm8250) highly unstable and I run into
> > > > following crash every now and then:
> > > >
> > > > [    T1] vreg_l11c_3p3: failed to enable: -ETIMEDOUT
> > > > [    T1] qcom-rpmh-regulator 18200000.rsc:pm8150l-rpmh-regulators: ldo11: devm_regulator_register() failed, ret=-110
> > > > [    T1] qcom-rpmh-regulator: probe of 18200000.rsc:pm8150l-rpmh-regulators failed with error -110
> > > >
> > > > I reported this breakage earlier this year as well:
> > > > https://lore.kernel.org/all/CAMi1Hd2Sngya_2m2odkjq4fdV8OiiXsFMEX1bb807cWMC7H-sg@xxxxxxxxxxxxxx/
> > > > I can confirm that if I cherry-pick the rest of the patches from the
> > > > series then I can't reproduce this crash, but I'm not sure when the rest
> > > > of the patches are going to land though.
> >
> > I have been talking to Maulik (offlist) about re-posting the series,
> > but apparently she has been too busy to move this forward.
> >
> > I assume a better option, than reverting, is to get the above series
> > merged. If I recall, there were only a few minor comments from me on
> > the genpd patch [1]. That said, let me help out and refresh the
> > series, I will do it asap!
> >
> > > >
> > > > Signed-off-by: Amit Pundir <amit.pundir@xxxxxxxxxx>
> > > > ---
> > > >  arch/arm64/boot/dts/qcom/sm8250.dtsi | 105 ---------------------------
> > > >  1 file changed, 105 deletions(-)
> > > >
> > > > diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi b/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > > > index a5b62cadb129..a2c15da1a450 100644
> > > > --- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > > > +++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
> > > > @@ -101,8 +101,6 @@ CPU0: cpu@0 {
> > > >                       capacity-dmips-mhz = <448>;
> > > >                       dynamic-power-coefficient = <205>;
> > > >                       next-level-cache = <&L2_0>;
> > > > -                     power-domains = <&CPU_PD0>;
> > > > -                     power-domain-names = "psci";
> > > >                       qcom,freq-domain = <&cpufreq_hw 0>;
> > > >                       operating-points-v2 = <&cpu0_opp_table>;
> > > >                       interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > > @@ -125,8 +123,6 @@ CPU1: cpu@100 {
> > > >                       capacity-dmips-mhz = <448>;
> > > >                       dynamic-power-coefficient = <205>;
> > > >                       next-level-cache = <&L2_100>;
> > > > -                     power-domains = <&CPU_PD1>;
> > > > -                     power-domain-names = "psci";
> > > >                       qcom,freq-domain = <&cpufreq_hw 0>;
> > > >                       operating-points-v2 = <&cpu0_opp_table>;
> > > >                       interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > > @@ -146,8 +142,6 @@ CPU2: cpu@200 {
> > > >                       capacity-dmips-mhz = <448>;
> > > >                       dynamic-power-coefficient = <205>;
> > > >                       next-level-cache = <&L2_200>;
> > > > -                     power-domains = <&CPU_PD2>;
> > > > -                     power-domain-names = "psci";
> > > >                       qcom,freq-domain = <&cpufreq_hw 0>;
> > > >                       operating-points-v2 = <&cpu0_opp_table>;
> > > >                       interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > > @@ -167,8 +161,6 @@ CPU3: cpu@300 {
> > > >                       capacity-dmips-mhz = <448>;
> > > >                       dynamic-power-coefficient = <205>;
> > > >                       next-level-cache = <&L2_300>;
> > > > -                     power-domains = <&CPU_PD3>;
> > > > -                     power-domain-names = "psci";
> > > >                       qcom,freq-domain = <&cpufreq_hw 0>;
> > > >                       operating-points-v2 = <&cpu0_opp_table>;
> > > >                       interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > > @@ -188,8 +180,6 @@ CPU4: cpu@400 {
> > > >                       capacity-dmips-mhz = <1024>;
> > > >                       dynamic-power-coefficient = <379>;
> > > >                       next-level-cache = <&L2_400>;
> > > > -                     power-domains = <&CPU_PD4>;
> > > > -                     power-domain-names = "psci";
> > > >                       qcom,freq-domain = <&cpufreq_hw 1>;
> > > >                       operating-points-v2 = <&cpu4_opp_table>;
> > > >                       interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > > @@ -209,8 +199,6 @@ CPU5: cpu@500 {
> > > >                       capacity-dmips-mhz = <1024>;
> > > >                       dynamic-power-coefficient = <379>;
> > > >                       next-level-cache = <&L2_500>;
> > > > -                     power-domains = <&CPU_PD5>;
> > > > -                     power-domain-names = "psci";
> > > >                       qcom,freq-domain = <&cpufreq_hw 1>;
> > > >                       operating-points-v2 = <&cpu4_opp_table>;
> > > >                       interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > > @@ -231,8 +219,6 @@ CPU6: cpu@600 {
> > > >                       capacity-dmips-mhz = <1024>;
> > > >                       dynamic-power-coefficient = <379>;
> > > >                       next-level-cache = <&L2_600>;
> > > > -                     power-domains = <&CPU_PD6>;
> > > > -                     power-domain-names = "psci";
> > > >                       qcom,freq-domain = <&cpufreq_hw 1>;
> > > >                       operating-points-v2 = <&cpu4_opp_table>;
> > > >                       interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > > @@ -252,8 +238,6 @@ CPU7: cpu@700 {
> > > >                       capacity-dmips-mhz = <1024>;
> > > >                       dynamic-power-coefficient = <444>;
> > > >                       next-level-cache = <&L2_700>;
> > > > -                     power-domains = <&CPU_PD7>;
> > > > -                     power-domain-names = "psci";
> > > >                       qcom,freq-domain = <&cpufreq_hw 2>;
> > > >                       operating-points-v2 = <&cpu7_opp_table>;
> > > >                       interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>,
> > > > @@ -300,42 +284,6 @@ core7 {
> > > >                               };
> > > >                       };
> > > >               };
> > > > -
> > > > -             idle-states {
> > > > -                     entry-method = "psci";
> > > > -
> > > > -                     LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 {
> > > > -                             compatible = "arm,idle-state";
> > > > -                             idle-state-name = "silver-rail-power-collapse";
> > > > -                             arm,psci-suspend-param = <0x40000004>;
> > > > -                             entry-latency-us = <360>;
> > > > -                             exit-latency-us = <531>;
> > > > -                             min-residency-us = <3934>;
> > > > -                             local-timer-stop;
> > >
> > > If this is temporary fix for some broke firmware or setup, I suggest to
> > > just add status = "disabled" for these states. Also worth checking if keeping
> > > the cpu states is okay and only cluster state is the issue or everything
> > > needs to be disabled. That way it would avoid the churn when re-enabling it.
> >
> > That's a good option, unless we can get the other series (that fixes
> > this issue) merged soon. As stated, I will help to re-spin it and then
> > we can take it from there.
>
> Hi Ulf, I just verified over multiple reboots that disabling the
> cpuidle states, as suggested by Sudeep, does the trick and I no longer
> see the crash.
>
> Do you suggest we wait for the re-spin of the other series or should I
> go ahead and submit that RB5 workaround for the time being?

Yes, that seems like a good idea as the problem needs an easy fix for
6.1 and earlier. Please keep me posted.

In the meantime, I will re-spin and submit the series that fixes the
problem correctly, but let's target that series for 6.2.

Kind regards
Uffe