On Tue, 18 Oct 2022 at 16:09, Amit Pundir <amit.pundir@xxxxxxxxxx> wrote: > > On Tue, 18 Oct 2022 at 16:00, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > > > > On Mon, 17 Oct 2022 at 22:17, Sudeep Holla <sudeep.holla@xxxxxxx> wrote: > > > > > > On Mon, Oct 17, 2022 at 10:10:05PM +0530, Amit Pundir wrote: > > > > This reverts commit 32bc936d732171d48c9c8f96c4fa25ac3ed7e1c7. > > > > > > > > This patch was part of a patch series to add APSS RSC to > > > > Cluster power domain > > > > https://patchwork.kernel.org/project/linux-pm/cover/1641749107-31979-1-git-send-email-quic_mkshah@xxxxxxxxxxx/ > > > > but the rest of the patches in this series got NACKed and didn't land. > > > > > > > > These cpuidle states made RB5 (sm8250) highly unstable and I run into > > > > following crash every now and then: > > > > > > > > [ T1] vreg_l11c_3p3: failed to enable: -ETIMEDOUT > > > > [ T1] qcom-rpmh-regulator 18200000.rsc:pm8150l-rpmh-regulators: ldo11: devm_regulator_register() failed, ret=-110 > > > > [ T1] qcom-rpmh-regulator: probe of 18200000.rsc:pm8150l-rpmh-regulators failed with error -110 > > > > > > > > I reported this breakage earlier this year as well: > > > > https://lore.kernel.org/all/CAMi1Hd2Sngya_2m2odkjq4fdV8OiiXsFMEX1bb807cWMC7H-sg@xxxxxxxxxxxxxx/ > > > > I can confirm that if I cherry-pick the rest of the patches from the > > > > series then I can't reproduce this crash, but I'm not sure when the rest > > > > of the patches are going to land though. > > > > I have been talking to Maulik (offlist) about re-posting the series, > > but apparently she has been too busy to move this forward. > > > > I assume a better option, than reverting, is to get the above series > > merged. If I recall, there were only a few minor comments from me on > > the genpd patch [1]. That said, let me help out and refresh the > > series, I will do it asap! > > > > > > > > > > Signed-off-by: Amit Pundir <amit.pundir@xxxxxxxxxx> > > > > --- > > > > arch/arm64/boot/dts/qcom/sm8250.dtsi | 105 --------------------------- > > > > 1 file changed, 105 deletions(-) > > > > > > > > diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi b/arch/arm64/boot/dts/qcom/sm8250.dtsi > > > > index a5b62cadb129..a2c15da1a450 100644 > > > > --- a/arch/arm64/boot/dts/qcom/sm8250.dtsi > > > > +++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi > > > > @@ -101,8 +101,6 @@ CPU0: cpu@0 { > > > > capacity-dmips-mhz = <448>; > > > > dynamic-power-coefficient = <205>; > > > > next-level-cache = <&L2_0>; > > > > - power-domains = <&CPU_PD0>; > > > > - power-domain-names = "psci"; > > > > qcom,freq-domain = <&cpufreq_hw 0>; > > > > operating-points-v2 = <&cpu0_opp_table>; > > > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>, > > > > @@ -125,8 +123,6 @@ CPU1: cpu@100 { > > > > capacity-dmips-mhz = <448>; > > > > dynamic-power-coefficient = <205>; > > > > next-level-cache = <&L2_100>; > > > > - power-domains = <&CPU_PD1>; > > > > - power-domain-names = "psci"; > > > > qcom,freq-domain = <&cpufreq_hw 0>; > > > > operating-points-v2 = <&cpu0_opp_table>; > > > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>, > > > > @@ -146,8 +142,6 @@ CPU2: cpu@200 { > > > > capacity-dmips-mhz = <448>; > > > > dynamic-power-coefficient = <205>; > > > > next-level-cache = <&L2_200>; > > > > - power-domains = <&CPU_PD2>; > > > > - power-domain-names = "psci"; > > > > qcom,freq-domain = <&cpufreq_hw 0>; > > > > operating-points-v2 = <&cpu0_opp_table>; > > > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>, > > > > @@ -167,8 +161,6 @@ CPU3: cpu@300 { > > > > capacity-dmips-mhz = <448>; > > > > dynamic-power-coefficient = <205>; > > > > next-level-cache = <&L2_300>; > > > > - power-domains = <&CPU_PD3>; > > > > - power-domain-names = "psci"; > > > > qcom,freq-domain = <&cpufreq_hw 0>; > > > > operating-points-v2 = <&cpu0_opp_table>; > > > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>, > > > > @@ -188,8 +180,6 @@ CPU4: cpu@400 { > > > > capacity-dmips-mhz = <1024>; > > > > dynamic-power-coefficient = <379>; > > > > next-level-cache = <&L2_400>; > > > > - power-domains = <&CPU_PD4>; > > > > - power-domain-names = "psci"; > > > > qcom,freq-domain = <&cpufreq_hw 1>; > > > > operating-points-v2 = <&cpu4_opp_table>; > > > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>, > > > > @@ -209,8 +199,6 @@ CPU5: cpu@500 { > > > > capacity-dmips-mhz = <1024>; > > > > dynamic-power-coefficient = <379>; > > > > next-level-cache = <&L2_500>; > > > > - power-domains = <&CPU_PD5>; > > > > - power-domain-names = "psci"; > > > > qcom,freq-domain = <&cpufreq_hw 1>; > > > > operating-points-v2 = <&cpu4_opp_table>; > > > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>, > > > > @@ -231,8 +219,6 @@ CPU6: cpu@600 { > > > > capacity-dmips-mhz = <1024>; > > > > dynamic-power-coefficient = <379>; > > > > next-level-cache = <&L2_600>; > > > > - power-domains = <&CPU_PD6>; > > > > - power-domain-names = "psci"; > > > > qcom,freq-domain = <&cpufreq_hw 1>; > > > > operating-points-v2 = <&cpu4_opp_table>; > > > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>, > > > > @@ -252,8 +238,6 @@ CPU7: cpu@700 { > > > > capacity-dmips-mhz = <1024>; > > > > dynamic-power-coefficient = <444>; > > > > next-level-cache = <&L2_700>; > > > > - power-domains = <&CPU_PD7>; > > > > - power-domain-names = "psci"; > > > > qcom,freq-domain = <&cpufreq_hw 2>; > > > > operating-points-v2 = <&cpu7_opp_table>; > > > > interconnects = <&gem_noc MASTER_AMPSS_M0 &mc_virt SLAVE_EBI_CH0>, > > > > @@ -300,42 +284,6 @@ core7 { > > > > }; > > > > }; > > > > }; > > > > - > > > > - idle-states { > > > > - entry-method = "psci"; > > > > - > > > > - LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 { > > > > - compatible = "arm,idle-state"; > > > > - idle-state-name = "silver-rail-power-collapse"; > > > > - arm,psci-suspend-param = <0x40000004>; > > > > - entry-latency-us = <360>; > > > > - exit-latency-us = <531>; > > > > - min-residency-us = <3934>; > > > > - local-timer-stop; > > > > > > If this is temporary fix for some broke firmware or setup, I suggest to > > > just add status = "disabled" for these states. Also worth checking if keeping > > > the cpu states is okay and only cluster state is the issue or everything > > > needs to be disabled. That way it would avoid the churn when re-enabling it. > > > > That's a good option, unless we can get the other series (that fixes > > this issue) merged soon. As stated, I will help to re-spin it and then > > we can take it from there. > > Hi Ulf, I just verified over multiple reboots that disabling the > cpuidle states, as suggested by Sudeep, does the trick and I no longer > see the crash. > > Do you suggest we wait for the re-spin of the other series or should I > go ahead and submit that RB5 workaround for the time being? Yes, that seems like a good idea as the problem needs an easy fix for 6.1 and earlier. Please keep me posted. In the meantime, I will re-spin and submit the series that fixes the problem correctly, but let's target that series for 6.2. Kind regards Uffe