On 9/3/24 04:32, Thangaraj, Senthil Nathan wrote: > Hi Sean, > > Please find my review comments inline. > > Thanks, > Senthil. > >> -----Original Message----- >> From: Sean Anderson <sean.anderson@xxxxxxxxx> >> Sent: Monday, August 12, 2024 2:51 PM >> To: Simek, Michal <michal.simek@xxxxxxx>; linux-arm- >> kernel@xxxxxxxxxxxxxxxxxxx >> Cc: Rob Herring <robh@xxxxxxxxxx>; Conor Dooley <conor+dt@xxxxxxxxxx>; >> Krzysztof Kozlowski <krzk+dt@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; >> devicetree@xxxxxxxxxxxxxxx; Sean Anderson <sean.anderson@xxxxxxxxx> >> Subject: [PATCH 3/3] arm64: zynqmp: Add thermal zones >> >> Add some thermal trip points. We can't undervolt the CPUs to save power >> when we underclock them, so there isn't really a point in throttling them until >> we are about to overheat. As such, the passive trip point is right below the >> critical trip point. >> >> The critical trip point is the extended/industrial-grade maximum junction >> temperature of 100C minus the maximum temperature sensor error of 3.5C >> (in the range -55C to 110C). Automotive- and military-grade parts can go up >> to 125C, but as far as I can tell there is no way to detect them at runtime. >> Userspace can adjust the trip points at runtime, but this may not be viable >> when booting above 100C. I think it's reasonable to ask automotive/military >> users to edit their device trees to bump the trip points, but if that proves to be >> an issue we can always go with no default temperatures. However, that >> wouldn't be too nice for the majority of extended/industrial users. >> >> Signed-off-by: Sean Anderson <sean.anderson@xxxxxxxxx> >> --- >> >> arch/arm64/boot/dts/xilinx/zynqmp.dtsi | 86 >> ++++++++++++++++++++++++++ >> 1 file changed, 86 insertions(+) >> >> diff --git a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi >> b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi >> index 21c1adbaf35f..467f084c6469 100644 >> --- a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi >> +++ b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi >> @@ -18,6 +18,7 @@ >> #include <dt-bindings/interrupt-controller/irq.h> >> #include <dt-bindings/power/xlnx-zynqmp-power.h> >> #include <dt-bindings/reset/xlnx-zynqmp-resets.h> >> +#include <dt-bindings/thermal/thermal.h> >> >> / { >> compatible = "xlnx,zynqmp"; >> @@ -36,6 +37,7 @@ cpus { >> #size-cells = <0>; >> >> cpu0: cpu@0 { >> + #cooling-cells = <2>; >> compatible = "arm,cortex-a53"; >> device_type = "cpu"; >> enable-method = "psci"; >> @@ -46,6 +48,7 @@ cpu0: cpu@0 { >> }; >> >> cpu1: cpu@1 { >> + #cooling-cells = <2>; >> compatible = "arm,cortex-a53"; >> device_type = "cpu"; >> enable-method = "psci"; >> @@ -56,6 +59,7 @@ cpu1: cpu@1 { >> }; >> >> cpu2: cpu@2 { >> + #cooling-cells = <2>; >> compatible = "arm,cortex-a53"; >> device_type = "cpu"; >> enable-method = "psci"; >> @@ -66,6 +70,7 @@ cpu2: cpu@2 { >> }; >> >> cpu3: cpu@3 { >> + #cooling-cells = <2>; >> compatible = "arm,cortex-a53"; >> device_type = "cpu"; >> enable-method = "psci"; >> @@ -406,6 +411,87 @@ ams { >> <&xilinx_ams 27>, <&xilinx_ams 28>, <&xilinx_ams >> 29>; >> }; >> >> + >> + tsens_apu: thermal-sensor-apu { >> + compatible = "generic-adc-thermal"; >> + #thermal-sensor-cells = <0>; >> + io-channels = <&xilinx_ams 7>; >> + io-channel-names = "sensor-channel"; >> + }; >> + >> + tsens_rpu: thermal-sensor-rpu { >> + compatible = "generic-adc-thermal"; >> + #thermal-sensor-cells = <0>; >> + io-channels = <&xilinx_ams 8>; >> + io-channel-names = "sensor-channel"; >> + }; >> + >> + tsens_pl: thermal-sensor-pl { >> + compatible = "generic-adc-thermal"; >> + #thermal-sensor-cells = <0>; >> + io-channels = <&xilinx_ams 20>; >> + io-channel-names = "sensor-channel"; >> + }; >> + >> + thermal-zones { >> + apu-thermal { >> + polling-delay-passive = <1000>; >> + polling-delay = <5000>; > > How did we arrive at these delays, 1000 and 5000 ? could you please clarify ? They're just arbitrary. Feel free to suggest other numbers. I could find no guidance on this matter (or anything else thermal-related). >> + thermal-sensors = <&tsens_apu>; >> + >> + trips { >> + apu_passive: passive { >> + temperature = <93000>; >> + hysteresis = <3500>; >> + type = "passive"; >> + }; >> + >> + apu_critical: critical { >> + temperature = <96500>; >> + hysteresis = <3500>; >> + type = "critical"; >> + }; >> + }; >> + >> + cooling-maps { >> + map { >> + trip = <&apu_passive>; >> + cooling-device = >> + <&cpu0 THERMAL_NO_LIMIT >> THERMAL_NO_LIMIT>, >> + <&cpu1 THERMAL_NO_LIMIT >> THERMAL_NO_LIMIT>, >> + <&cpu2 THERMAL_NO_LIMIT >> THERMAL_NO_LIMIT>, >> + <&cpu3 THERMAL_NO_LIMIT >> THERMAL_NO_LIMIT>; >> + }; >> + }; >> + }; >> + >> + rpu-thermal { >> + polling-delay = <10000>; > > Same questions, how did we come up with number 10000 > >> + thermal-sensors = <&tsens_rpu>; >> + >> + trips { >> + critical { >> + temperature = <96500>; >> + hysteresis = <3500>; >> + type = "critical"; >> + }; > > Any reason why we haven't defined for passive trip point for RPU ? What is there to do? It is up to the RPU software to do something if the RPU is going to overheat. But the more-likely occurrence is that the APU is overheating and the heat is spreading to the RPU. In which case, the APU passive trip point will handle things. >> + }; >> + }; >> + >> + pl-thermal { >> + polling-delay = <10000>; >> + thermal-sensors = <&tsens_pl>; >> + >> + trips { >> + critical { >> + temperature = <96500>; >> + hysteresis = <3500>; >> + type = "critical"; >> + }; >> + }; > Same question, any reason why we haven't defined for passive trip point for PL ? >> + }; >> + }; >> + >> amba: axi { >> compatible = "simple-bus"; >> bootph-all; >> -- >> 2.35.1.1320.gc452695387.dirty >