[draft] Re: [PATCH 02/14] dt/bindings: update binding for PM domain idle states

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi folks,

I'm trying to start a conversation with Lina about using the
power-domains DT bindings for describing the EAS power model.

Partly because I'm still waiting for fossmail approval so I can't send
this email to anyone without an @arm.com address, I wanted to circulate
a draft internally before I make a fool of myself pubicly; does this
mail make sense to you guys (also please let me know if I'm egregiously
breaking linux-pm etiquette)?

Cheers,
Brendan

======================================================================

Hi Lina, Mark,

I'm currently investigating a generic way of expressing in DT the power
costs of CPU idle states, for use by the energy model for Energy Aware
Scheduling (EAS) [1][2].

Essentially what I want to do is probably add a "power-usage" property
to idle state nodes. I have heard that you and/or Kevin Hilman have
actually experimented with something similar.

However, the model is required to be topologically aware: for each idle
state, we want to know the power usage of the CPU, as well as of the
cluster-level resources it shares. This has led me to this patchset,
since the CPU power domain bindings allow you to tell which topology
level "owns" an idle state.

I guess my point here is that these bindings have another use-case
besides OS-coordinated cluster idle. I've hacked up a patch (attached)
for my experimentation which allows the bindings as described in this
mail to be used without any of the actual CPU Runtime PM implementation

I share Mark's concern about the duplication between cpu-idle-states in
CPU nodes and power-states in power domain nodes. As I understand it,
with this patchset the boundary between cpu-idle-states and power-states
nodes translates to the boundary between states managed by cpuidle and
states managed by Runtime PM.

Mark hinted at a solution where idle state nodes refer to the power
domains they affect. I guess this solution would look something like:

CPU_0: cpu@0 {
       compatible = "example-cpu";
       reg = <0x0 0x0>;
       device_type = "cpu";
       enable-method = "psci";
       cpu-idle-states = <&CPU_SLEEP &CLUSTER_SLEEP>;
       power-domains = <&CLUSTER_PD>;
};

/* ... elsewhere ... */

idle-states {
       CPU_SLEEP: cpu-sleep {
               compatible = "arm,idle-state";
               arm,psci-suspend-param = <0x0010000>;
               entry-latency-us = <100>;
               exit-latency-us = <250>;
               min-residency-us = <150>;
       };

       CLUSTER_SLEEP: cluster-sleep {
               compatible = "arm,idle-state";
               arm,psci-suspend-param = <0x1010000>;
               entry-latency-us = <800>;
               exit-latency-us = <700>;
               min-residency-us = <2500>;
               affects-power-domains = <&CLUSTER_PD>
       };
};

Then if desired/necessary, CLUSTER_SLEEP can be taken care of by Runtime
PM (i.e. become a genpd_power_state) rather than cpuidle, the deciding
factor being that it has an affects-power-domain property. (In this
model, idle state nodes don't have a power-states property at all.)

Mark, have I understood you correctly, is that what you were imagining?


On 23/06/16 19:39, Lina Iyer wrote:
On Thu, Jun 23 2016 at 12:19 -0600, Mark Rutland wrote:
On Thu, Jun 23, 2016 at 12:04:51PM -0600, Lina Iyer wrote:
Hi Mark,

On Thu, Jun 23 2016 at 11:35 -0600, Mark Rutland wrote:
>Hi,
>
>On Wed, Jun 22, 2016 at 01:36:37PM -0600, Lina Iyer wrote:
>>From: Axel Haslam <ahaslam+renesas@xxxxxxxxxxxx>
>>
>>Update DT bindings to describe idle states of PM domains.
>>
>>Cc: <devicetree@xxxxxxxxxxxxxxx>
>>Signed-off-by: Marc Titinger <mtitinger+renesas@xxxxxxxxxxxx>
>>Signed-off-by: Lina Iyer <lina.iyer@xxxxxxxxxx>
>>[Lina: Added state properties, removed state names, wakeup-latency,
>>added of_pm_genpd_init() API, pruned commit text]
>>Signed-off-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
>>[Ulf: Moved around code to make it compile properly, rebased on top
of multiple state support]
>>---
>> .../devicetree/bindings/power/power_domain.txt     | 70
++++++++++++++++++++++
>> 1 file changed, 70 insertions(+)
>>
>>diff --git
a/Documentation/devicetree/bindings/power/power_domain.txt
b/Documentation/devicetree/bindings/power/power_domain.txt
>>index 025b5e7..41e8dda 100644
>>--- a/Documentation/devicetree/bindings/power/power_domain.txt
>>+++ b/Documentation/devicetree/bindings/power/power_domain.txt
>>@@ -29,6 +29,43 @@ Optional properties:
>>    specified by this binding. More details about power domain
specifier are
>>    available in the next section.
>>
>>+- power-states : A phandle of an idle-state that shall be soaked
into a
>>+                generic domain power state.
>
>It's somewhat unfortunate that this gives us two possible locations for
>idle state lists (under the /cpus node and in a pm-domains node),
>especially as it's not clear what would happen were a DT to have both.
>
>I would prefer that we extend the existing bindings such that states
can
>refer to the power domains which they affect.
>
I agree. The CPU idle states have become defined to be specific to CPUs.
PM Domain idle states are generic for any type of domain. I am hoping at
some point, we could converge and use the same idle state, but that
would mean changing the CPU idle states to make it generic.

Outside of CPU idling, I don't fully understand how this will be used,
so it's not clear to me what would need to be made generic. Apologies
for my ignorance there.

There may be non-PSCI ARM v7 CPU domains that may have domain controller
drivers in the kernel. They would not hook into the ARM PSCI frameworks.
It is still cpuidle though.


Apologies also for my ignorance, but I don't think I understand you
here. It sounds to me like the drivers you're describing are currently
unable to get their idle states from DT - is that correct? If so,
perhaps we can implement support using the CPU idle state nodes at
first, then later make them generic so that these non-PSCI domains can
use DT.

At some point, during my development, I did use the arm,idle-state for
domains as well, but the binding definitions were too restrictive for
a generic PM domain.

I would be willing to make the change to CPU idle states to make it
generic and then we could just reference domain and CPU idle states
using the same bindings. Are we okay with that, specifically,
arm,psci-suspend-param? This binding is very restrictive in its
description. What we pass to the platform driver upon choosing a domain
state is very platform specific and therefore has to be generic in its
description.

I was suggesting that for PSCI we should consistently us
arm,psci-suspend-param, not that this should be used for all power
domain state data.

I imagine that mechanisms for powering down power domains will have
varied requirements on data they require (and may require more than can
be encoded in a u32), and I don't think it's best to try to force a
single representation in the DT for that. It would be better to allow
them to define the properties which they require.

The only way to do that is to push the DT parsing to the platform
drivers. In the case of CPU domains controlled by PSCI, we could use the
arm,idle-states but any other generic domain, may need to define their
own bindings and fill up the domain states before initiailizing the domain.

While this approach pushes the onus on to the platform code, I am fine
with it. Is that what you were thinking too?

[1]
http://www.linaro.org/blog/core-dump/energy-aware-scheduling-eas-progress-update/

[2] https://lkml.org/lkml/2015/7/7/754
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
>From 598a7acb19af3763aa17926168e24d8131f3a57d Mon Sep 17 00:00:00 2001
From: Brendan Jackman <brendan.jackman@xxxxxxx>
Date: Mon, 25 Jul 2016 14:26:08 +0100
Subject: [PATCH] cpuidle: dt: Parse states from power domain nodes

This commit allows bindings similar to those in [1] to be used without
requiring CPU PM domains. The cpuidle DT parsing code will first read
idle states from the "cpu-idle-states" property, then walk up the
power-domains tree (assuming that it is indeed a tree) for the CPU and
read idle states from the "power-states" property.

This is a temporary hack to enable a prototype EAS energy model in the
device tree, for which we want to use CPU power-domains to describe
energy costs in a topologically-aware way.

[1] https://patchwork.kernel.org/patch/9193651/
---
 drivers/cpuidle/dt_idle_states.c | 85 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 82 insertions(+), 3 deletions(-)

diff --git a/drivers/cpuidle/dt_idle_states.c b/drivers/cpuidle/dt_idle_states.c
index a5c111b..70fe30b 100644
--- a/drivers/cpuidle/dt_idle_states.c
+++ b/drivers/cpuidle/dt_idle_states.c
@@ -21,6 +21,8 @@
 
 #include "dt_idle_states.h"
 
+#define MAX_PD_DEPTH 8
+
 static int init_state_node(struct cpuidle_state *idle_state,
 			   const struct of_device_id *matches,
 			   struct device_node *state_node)
@@ -92,6 +94,84 @@ static int init_state_node(struct cpuidle_state *idle_state,
 }
 
 /*
+ * First look in cpu-idle-states, then walk up the power-domains tree, returning
+ * the idx'th idle state node encountered, or NULL if it can't be found.
+ *
+ * Returns the device_node pointer with refcount incremented.  Use
+ * of_node_put() on it when done.
+ */
+static struct device_node *get_idle_state_node(struct device_node *cpu_node,
+					       int idx)
+{
+	struct device_node *state_node, *consumer_node;
+	int num_cpu_idle_states, num_pd_idle_states;
+	int idx_pd;
+	unsigned int i;
+
+	state_node = of_parse_phandle(cpu_node, "cpu-idle-states", idx);
+	if (state_node)
+		return state_node;
+
+	num_cpu_idle_states = of_property_count_elems_of_size(cpu_node,
+							      "cpu-idle-states",
+							      4);
+	if (num_cpu_idle_states < 1) {
+		pr_devel("%s: no cpu-idle-states for %s\n", __func__,
+			 cpu_node->name);
+		num_cpu_idle_states = 0;
+	}
+
+	pr_devel("%s: Looking in %s's power domains tree for state %d\n",
+		 __func__, cpu_node->full_name, idx);
+
+	BUG_ON(idx < num_cpu_idle_states);
+	idx_pd = idx - num_cpu_idle_states;
+
+	/* Use a loop counter so we don't infiniloop if there's a cycle in the
+	 * power-domains graph. */
+	consumer_node = cpu_node;
+	for (i = 0; i < MAX_PD_DEPTH; i++) {
+		int num_pd_providers;
+		struct device_node *pd_node = of_parse_phandle(
+			consumer_node, "power-domains", 0);
+		if (!pd_node) {
+			return NULL;
+		}
+
+		/* We're assuming the power-domains graph is a tree, complain if
+		 * not. */
+		num_pd_providers = of_property_count_elems_of_size(
+			consumer_node, "power-domains", 4);
+		BUG_ON(num_pd_providers < 1);
+		if (num_pd_providers > 1)
+			pr_warn("power-domains graph for %s not a tree "
+				"(%s has multiple power-domains)\n",
+				cpu_node->full_name, consumer_node->full_name);
+
+
+		state_node = of_parse_phandle(pd_node, "power-states", idx_pd);
+		if (state_node) {
+			of_node_put(pd_node);
+			return state_node;
+		}
+
+		num_pd_idle_states = of_property_count_elems_of_size(
+			pd_node, "power-states", 4);
+		if (num_pd_idle_states < 1)
+			num_pd_idle_states = 0;
+
+		BUG_ON(idx_pd < num_pd_idle_states);
+		idx_pd -= num_pd_idle_states;
+
+		of_node_put(pd_node);
+		consumer_node = pd_node;
+	}
+
+	WARN(1, "DT CPU power domain graph too deep. Probably has a cycle.");
+	return NULL;
+}
+
+/*
  * Check that the idle state is uniform across all CPUs in the CPUidle driver
  * cpumask
  */
@@ -112,8 +192,7 @@ static bool idle_state_valid(struct device_node *state_node, unsigned int idx,
 	for (cpu = cpumask_next(cpumask_first(cpumask), cpumask);
 	     cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpumask)) {
 		cpu_node = of_cpu_device_node_get(cpu);
-		curr_state_node = of_parse_phandle(cpu_node, "cpu-idle-states",
-						   idx);
+		curr_state_node = get_idle_state_node(cpu_node, idx);
 		if (state_node != curr_state_node)
 			valid = false;
 
@@ -170,7 +249,7 @@ int dt_init_idle_driver(struct cpuidle_driver *drv,
 	cpu_node = of_cpu_device_node_get(cpumask_first(cpumask));
 
 	for (i = 0; ; i++) {
-		state_node = of_parse_phandle(cpu_node, "cpu-idle-states", i);
+		state_node = get_idle_state_node(cpu_node,i);
 		if (!state_node)
 			break;
 
-- 
1.9.1


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [Linux for Sparc]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux