Re: [PATCH 2/3] arm64: dts: qcom: sm8150: Add PSCI idle states

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue 29 Dec 17:19 CST 2020, Danny Lin wrote:

> On Mon, Dec 28, 2020 at10:02 AM, Bjorn Andersson wrote:
> > On Tue 22 Dec 20:00 CST 2020, Danny Lin wrote:
> > 
> > > On Sun, Dec 20, 2020 at 7:48 PM, Bjorn Andersson wrote:
> > > > On Sun 20 Dec 16:29 PST 2020, Danny Lin wrote:
> > > > 
> > > > > Like other Qualcomm SoCs, sm8150 exposes CPU and cluster idle states
> > > > > through PSCI. Define the idle states to save power when the CPU is not
> > > > > in active use.
> > > > > 
> > > > > These idle states, latency, and residency values match the downstream
> > > > > 4.14 kernel from Qualcomm as of LA.UM.8.1.r1-15600-sm8150.0.
> > > > > 
> > > > > It's worth noting that the CPU has an additional C3 power collapse idle
> > > > > state between WFI and rail power collapse (with PSCI mode 0x40000003),
> > > > > but it is not officially used in downstream kernels due to "thermal
> > > > > throttling issues."
> > > > > 
> > > > 
> > > > Thanks Danny for this series, very happy to see this kind of additions.
> > > > Just one small question about the cluster param below.
> > > > 
> > > > > Signed-off-by: Danny Lin <danny@xxxxxxxxxxx>
> > > > > ---
> > > > >  arch/arm64/boot/dts/qcom/sm8150.dtsi | 50 ++++++++++++++++++++++++++++
> > > > >  1 file changed, 50 insertions(+)
> > > > > 
> > > > > diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > > > index 75ed38ee5d88..edc1fe6d7f1b 100644
> > > > > --- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > > > +++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > > > @@ -50,6 +50,8 @@ CPU0: cpu@0 {
> > > > >  			compatible = "qcom,kryo485";
> > > > >  			reg = <0x0 0x0>;
> > > > >  			enable-method = "psci";
> > > > > +			cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > > +					   &CLUSTER_SLEEP_0>;
> > > > >  			next-level-cache = <&L2_0>;
> > > > >  			qcom,freq-domain = <&cpufreq_hw 0>;
> > > > >  			#cooling-cells = <2>;
> > > > > @@ -67,6 +69,8 @@ CPU1: cpu@100 {
> > > > >  			compatible = "qcom,kryo485";
> > > > >  			reg = <0x0 0x100>;
> > > > >  			enable-method = "psci";
> > > > > +			cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > > +					   &CLUSTER_SLEEP_0>;
> > > > >  			next-level-cache = <&L2_100>;
> > > > >  			qcom,freq-domain = <&cpufreq_hw 0>;
> > > > >  			#cooling-cells = <2>;
> > > > > @@ -82,6 +86,8 @@ CPU2: cpu@200 {
> > > > >  			compatible = "qcom,kryo485";
> > > > >  			reg = <0x0 0x200>;
> > > > >  			enable-method = "psci";
> > > > > +			cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > > +					   &CLUSTER_SLEEP_0>;
> > > > >  			next-level-cache = <&L2_200>;
> > > > >  			qcom,freq-domain = <&cpufreq_hw 0>;
> > > > >  			#cooling-cells = <2>;
> > > > > @@ -96,6 +102,8 @@ CPU3: cpu@300 {
> > > > >  			compatible = "qcom,kryo485";
> > > > >  			reg = <0x0 0x300>;
> > > > >  			enable-method = "psci";
> > > > > +			cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > > +					   &CLUSTER_SLEEP_0>;
> > > > >  			next-level-cache = <&L2_300>;
> > > > >  			qcom,freq-domain = <&cpufreq_hw 0>;
> > > > >  			#cooling-cells = <2>;
> > > > > @@ -110,6 +118,8 @@ CPU4: cpu@400 {
> > > > >  			compatible = "qcom,kryo485";
> > > > >  			reg = <0x0 0x400>;
> > > > >  			enable-method = "psci";
> > > > > +			cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > > +					   &CLUSTER_SLEEP_0>;
> > > > >  			next-level-cache = <&L2_400>;
> > > > >  			qcom,freq-domain = <&cpufreq_hw 1>;
> > > > >  			#cooling-cells = <2>;
> > > > > @@ -124,6 +134,8 @@ CPU5: cpu@500 {
> > > > >  			compatible = "qcom,kryo485";
> > > > >  			reg = <0x0 0x500>;
> > > > >  			enable-method = "psci";
> > > > > +			cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > > +					   &CLUSTER_SLEEP_0>;
> > > > >  			next-level-cache = <&L2_500>;
> > > > >  			qcom,freq-domain = <&cpufreq_hw 1>;
> > > > >  			#cooling-cells = <2>;
> > > > > @@ -138,6 +150,8 @@ CPU6: cpu@600 {
> > > > >  			compatible = "qcom,kryo485";
> > > > >  			reg = <0x0 0x600>;
> > > > >  			enable-method = "psci";
> > > > > +			cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > > +					   &CLUSTER_SLEEP_0>;
> > > > >  			next-level-cache = <&L2_600>;
> > > > >  			qcom,freq-domain = <&cpufreq_hw 1>;
> > > > >  			#cooling-cells = <2>;
> > > > > @@ -152,6 +166,8 @@ CPU7: cpu@700 {
> > > > >  			compatible = "qcom,kryo485";
> > > > >  			reg = <0x0 0x700>;
> > > > >  			enable-method = "psci";
> > > > > +			cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > > +					   &CLUSTER_SLEEP_0>;
> > > > >  			next-level-cache = <&L2_700>;
> > > > >  			qcom,freq-domain = <&cpufreq_hw 2>;
> > > > >  			#cooling-cells = <2>;
> > > > > @@ -196,6 +212,40 @@ core7 {
> > > > >  				};
> > > > >  			};
> > > > >  		};
> > > > > +
> > > > > +		idle-states {
> > > > > +			entry-method = "psci";
> > > > > +
> > > > > +			LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 {
> > > > > +				compatible = "arm,idle-state";
> > > > > +				idle-state-name = "little-rail-power-collapse";
> > > > > +				arm,psci-suspend-param = <0x40000004>;
> > > > > +				entry-latency-us = <355>;
> > > > > +				exit-latency-us = <909>;
> > > > > +				min-residency-us = <3934>;
> > > > > +				local-timer-stop;
> > > > > +			};
> > > > > +
> > > > > +			BIG_CPU_SLEEP_0: cpu-sleep-1-0 {
> > > > > +				compatible = "arm,idle-state";
> > > > > +				idle-state-name = "big-rail-power-collapse";
> > > > > +				arm,psci-suspend-param = <0x40000004>;
> > > > > +				entry-latency-us = <241>;
> > > > > +				exit-latency-us = <1461>;
> > > > > +				min-residency-us = <4488>;
> > > > > +				local-timer-stop;
> > > > > +			};
> > > > > +
> > > > > +			CLUSTER_SLEEP_0: cluster-sleep-0 {
> > > > > +				compatible = "arm,idle-state";
> > > > > +				idle-state-name = "cluster-power-collapse";
> > > > > +				arm,psci-suspend-param = <0x400000F4>;
> > > > 
> > > > How come this is 0xf4?
> > > > 
> > > > Isn't downstream saying that this should be either 0x1 << 4 or 0xc24 <<
> > > > 4, depending on how deep we want to go? Could we at least mention why
> > > > this is 0xf4?
> > > 
> > > I'm not sure where 0x400000F4 originally came from. I noticed that
> > > sdm845 uses the same 0xc24 mode in downstream, but Qualcomm used
> > > 0x400000F4 in mainline.
> > > 
> > > I did some testing on a downstream kernel and found that the real value
> > > it uses on sm8150 is 0x4100c244, but the idle state doesn't work at all
> > > if I use the same value on mainline. The logic appears to be the same in
> > > the downstream sdm845 kernel. Maybe it has to do with how downstream has
> > > "notify RPM" before attempting to enter the idle state?
> > > 
> > > In downstream, the final PSCI value is calculated as the sum of:
> > > 
> > > 1. (cluster-mode & cluster-mode-mask) << cluster-mode-shift = (0xc24 & 0xfff) << 4 = 0xc240
> > > 2. (is-reset << 30) = 0x40000000
> > > 3. (affinity level & 0x3) << 24 = 0x1000000
> > > 4. (cpu-mode) = 0x4
> > > 
> > > so 0xc240 + 0x40000000 + 0x1000000 + 0x4 = 0x4100c244.
> > > 
> > > It's also possible that the problem comes from the cluster idle state
> > > needing all CPUs in the cluster to be asleep (as far as I know), since
> > > it doesn't look like mainline handles that.
> > > 
> > 
> > Thanks for the explanation. I believe we have the code in place to do
> > OSI sleep using the "psci domain cpuidle" driver, but I'm not entirely
> > sure about the details about it - perhaps it's just a matter of wiring
> > it all up(?).
> > 
> > Let's go with your current patches and then swing back to this once
> > we've figured out the remaining details.
> 
> Following your hint, I was able to get cluster idle working using power
> domain idle states. The cluster idle state is now successfully using the
> same value as downstream with no apparent issues, and individual CPU
> idle states are still working. Time spent in the cluster idle state
> increases when and only when all CPUs are idle, which matches the
> expected behavior.
> 

Really interesting, thanks for pursuing this!

> Should I send a separate patch for it or revise this series? It might be
> helpful for future reference to keep a record of how to convert the
> current 0xf4 cluster states on modern Qualcomm SoCs in the commit
> history.
> 

I did go ahead and merge this series last week, and I like the idea of
"documenting" how the difference - so please send this as a separate
patch(es).

Thanks,
Bjorn

> >
> > Thanks,
> > Bjorn
> > 
> > > > 
> > > > Regards,
> > > > Bjorn
> > > > 
> > > > > +				entry-latency-us = <3263>;
> > > > > +				exit-latency-us = <6562>;
> > > > > +				min-residency-us = <9987>;
> > > > > +				local-timer-stop;
> > > > > +			};
> > > > > +		};
> > > > >  	};
> > > > >  
> > > > >  	firmware {
> > > > > -- 
> > > > > 2.29.2
> > > > > 
> > > >
> >



[Index of Archives]     [Device Tree Compilter]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux PCI Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Yosemite Backpacking]


  Powered by Linux