On 3.01.2025 3:38 PM, Neil Armstrong wrote:
> On the SM8650 platform, the dynamic clock and voltage scaling (DCVS) for
> the CPUs and GPU is handled by hardware & firmware using factory and
> form-factor determined parameters in order to maximize frequency while
> keeping the temperature way below the junction temperature where the SoC
> would experience a thermal shutdown if not permanent damage.
>
> On the other side, the High Level Operating System (HLOS), like Linux,
> is able to adjust the CPU and GPU frequency using the internal SoC
> temperature sensors (here tsens) and its UP/LOW interrupts, but it
> effectively does the same work twice in a less effective manner.
>
> Let's take the Hardware & Firmware action into account and design the
> thermal zones trip points and cooling devices mapping to use the HLOS
> as a safety warrant in case the platform experiences a temperature surge
> to hopefully avoid a thermal shutdown and handle the scenario gracefully.
>
> On the CPU side, the LMh hardware does the DCVS control loop, so
> let's set higher trip point temperatures closer to the junction
> and thermal shutdown temperatures and add an idle injection cooling
> device with 100% duty cycle for each CPU that would act as an emergency
> action to avoid the thermal shutdown.
>
> On the GPU side, the GPU Management Unit (GMU) acts as the DCVS
> control loop, but since we can't perform idle injection, let's
> also set higher trip point temperatures closer to the junction
> and thermal shutdown temperatures to reduce the GPU frequency only
> as an emergency action before the thermal shutdown.
>
> Those 2 changes optimize the thermal management design by avoiding
> concurrent thermal management, calculations & avoidable interrupts
> by moving the HLOS management to a last resort emergency if the
> Hardware & Firmware fail to avoid a thermal shutdown.
>
> Signed-off-by: Neil Armstrong <neil.armstrong@xxxxxxxxxx>
> ---

Got any numbers to back this?

Konrad