On Wed, Dec 08, 2021 at 07:29:43PM +0100, Daniel Lezcano wrote:
>
> Hi Rob,
>
> thanks for taking the time to review the bindings.
>
> On 07/12/2021 20:58, Rob Herring wrote:
> > On Sun, Dec 5, 2021 at 5:16 PM Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> wrote:
> >>
> >> The proposed bindings are describing a set of powerzones.
> >>
> >> A power zone is the logical name for a component which is capable of
> >> power capping and where we can measure the power consumption.
> >
> > How is the power consumption measured? I don't see anything in the
> > binding allowing for that.
>
> Mmh, good point.
>
> It is based on the energy model which is built from the
> "dynamic-power-coefficient" but this one provides only for CPUs and GPUs
> ATM.

That's more a calculated power than measured. Measured implies some h/w
or firmware interface, but there isn't one it seems. Wouldn't anything
with "dynamic-power-coefficient" be a powerzone implicitly?

> In the future, SCMI will provide get/set power/level
>
> What would you suggest?
>
> >> A power zone can aggregate several power zones in terms of power
> >> measurement and power limitations. That allows to apply power
> >> constraint to a group of components and let the system balance the
> >> allocated power in order to comply with the constraint.
> >>
> >> The ARM System Control and Management Interface (SCMI) can provide a
> >> power zone description.
> >
> > Instead of DT?
>
> It can use DT or SCMI protocol. That is what I understood from the white
> paper [1] page 6
>
> Lukasz may confirm / elaborate ?

Do you mean the 'performance protocol'? Because we have a binding for
that too! And that is exactly the problem. No one, AFAICT, looks at all
aspects of power/performance/thermal together.

> >> The powerzone semantic is also found on the Intel platform with the
> >> RAPL register.
> >
> > That means nothing to me...
>
> The Running Average Power Limit [2].
> Each powerzone has a RAPL register
> where you can read the power and set the power limit.
>
> >> The Linux kernel powercap framework deals with the powerzones:
> >>
> >> https://www.kernel.org/doc/html/latest/power/powercap/powercap.html
> >>
> >> The powerzone can also represent a group of children powerzones, hence
> >> the description can result on a hierarchy. Such hierarchy already
> >> exists with the hardware or can be represented and computed from the
> >> kernel.
> >>
> >> The hierarchical description was initially proposed but not desired
> >> given there are other descriptions like the power domain proposing
> >> almost the same description.
> >>
> >> https://lore.kernel.org/all/CAL_JsqLuLcHj7525tTUmh7pLqe7T2j6UcznyhV7joS8ipyb_VQ@xxxxxxxxxxxxxx/
> >>
> >> The description gives the power constraint dependencies to apply on a
> >> specific group of logically or physically aggregated devices. They do
> >> not represent the physical location or the power domains of the SoC
> >> even if the description could be similar.
> >>
> >> Cc: Arnd Bergmann <arnd@xxxxxxxx>
> >> Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> >> Cc: Rob Herring <robh+dt@xxxxxxxxxx>
> >> Reviewed-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> >> Signed-off-by: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
> >> ---
> >>  V3:
> >>    - Removed required property 'compatible'
> >>    - Removed powerzone-cells from the topmost node
> >>    - Removed powerzone-cells from cpus 'consumers' in example
> >>    - Set additionnal property to false
> >>  V2:
> >>    - Added pattern properties and stick to powerzone-*
> >>    - Added required property compatible and powerzone-cells
> >>    - Added additionnal property
> >>    - Added compatible
> >>    - Renamed to 'powerzones'
> >>    - Added missing powerzone-cells to the topmost node
> >>    - Fixed errors reported by 'make DT_CHECKER_FLAGS=-m dt_binding_check'
> >>  V1: Initial post
> >> ---
> >>  .../devicetree/bindings/power/powerzones.yaml | 97 +++++++++++++++++++
> >>  1 file changed, 97 insertions(+)
> >>  create mode 100644 Documentation/devicetree/bindings/power/powerzones.yaml
> >>
> >> diff --git a/Documentation/devicetree/bindings/power/powerzones.yaml b/Documentation/devicetree/bindings/power/powerzones.yaml
> >> new file mode 100644
> >> index 000000000000..ddb790acfea6
> >> --- /dev/null
> >> +++ b/Documentation/devicetree/bindings/power/powerzones.yaml
> >> @@ -0,0 +1,97 @@
> >> +# SPDX-License-Identifier: GPL-2.0
> >
> > New bindings should be dual licensed (add BSD-2-Clause).
> >
> >> +%YAML 1.2
> >> +---
> >> +$id: http://devicetree.org/schemas/power/powerzones.yaml#
> >> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> >> +
> >> +title: Power zones description
> >> +
> >> +maintainers:
> >> +  - Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
> >> +
> >> +description: |+
> >> +
> >> +  A System on Chip contains a multitude of active components and each
> >> +  of them is a source of heat.
> >> +  Even if a temperature sensor is not
> >> +  present, a source of heat can be controlled by acting on the
> >> +  consumed power via different techniques.
> >> +
> >> +  A powerzone describes a component or a group of components where we
> >> +  can control the maximum power consumption. For instance, a group of
> >> +  CPUs via the performance domain, a LCD screen via the brightness,
> >> +  etc ...
> >> +
> >> +  Different components when they are used together can significantly
> >> +  increase the overall temperature, so the description needs to
> >> +  reflect this dependency in order to assign a power budget for a
> >> +  group of powerzones.
> >> +
> >> +  This description is done via a hierarchy and the DT reflects it. It
> >> +  does not represent the physical location or a topology, eg. on a
> >> +  big.Little system, the little CPUs may not be represented as they do
> >> +  not contribute significantly to the heat, however the GPU can be
> >> +  tied with the big CPUs as they usually have a connection for
> >> +  multimedia or game workloads.
> >
> > Can't most of this just be assumed. We have some DT data already for
> > capacity and power per mhz along with opp tables. Isn't that enough
> > information?
>
> We have a lot of information already and that is the reason why there is
> few information in the description ATM. We need to describe what is a
> powerzone and the constraints hierarchy between the powerzones.
>
> The hierarchy could be in the hardware and immutable like the RAPL as
> described above which has a RAPL per package, per memory and one on top
> of them reporting their energy consumption.
>
> Here we want to describe how we want to aggregate the powerzones, so the
> power constraints will be hierarchically described.
>
> > The correlation with CPU and GPU usage is totally workload dependent
> > which has nothing to do with DT.
>
> I was probably unclear, IMO it is platform specific.

We can debate that, but it shouldn't come down to our opinions.
Where are the multiple platforms that show this is different and that we
must describe the difference?

> For example, let's imagine we have a *thermal* sensor between the Bigs
> and the GPU. There is no way to know which one is contributing and how
> to mitigate them.

So once the h/w is not sharing a thermal sensor, we don't need the
powerzone binding and go back to thermal zones? Sounds like thermal
zones need to be extended to work without a sensor.

> But if we know the sustainable power for the big+gpu is eg. 5000mW, then
> we can group them under the same powerzone parent and set its power to
> the sustainable one. From there it is possible to ensure the power limit
> and act on the power for each of them.
>
> > Nor it is platform specific really.
>
> The problem is we have devices which are powerzones (CPU, GPU, screen
> backlight, memory, DSP, ...) and AFAICT they can be described in the DT
> as such (may be just with a property), right?

Right, I would think that is "dynamic-power-coefficient" plus an OPP
table for the devices.

> Unfortunately, we have only a part of the description because we don't
> have the relationship between them. Can this relationship be described
> in the DT?

I'd rather see this taken as far as possible without DT and when things
don't work across multiple platforms look at what needs to be in DT. Or
say the interface must be SCMI and that has to provide everything you
need. It wouldn't be the first Arm spec that had to be revised because
the DT binding was rejected (FF-A).

Rob