Hi Daniel Lezcano, > -----Original Message----- > From: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> > Sent: 30 January 2025 10:07 > Subject: Re: [PATCH 2/6] thermal: of: Export non-devres helper to register/unregister thermal zone > > On Thu, Jan 30, 2025 at 11:08:03AM +0200, Claudiu Beznea wrote: > > Hi, Daniel, > > > > On 29.01.2025 19:24, Daniel Lezcano wrote: > > > Hi Claudiu, > > > > > > On Fri, Jan 03, 2025 at 06:38:01PM +0200, Claudiu wrote: > > >> From: Claudiu Beznea <claudiu.beznea.uj@xxxxxxxxxxxxxx> > > >> > > >> On the Renesas RZ/G3S (and other Renesas SoCs, e.g., RZ/G2{L, LC, > > >> UL}), clocks are managed through PM domains. These PM domains, > > >> registered on behalf of the clock controller driver, are configured > > >> with GENPD_FLAG_PM_CLK. In most of the Renesas drivers used by RZ > > >> SoCs, the clocks are enabled/disabled using runtime PM APIs. > > >> > > >> During probe, devices are attached to the PM domain controlling > > >> their clocks. Similarly, during removal, devices are detached from the PM domain. > > >> > > >> The detachment call stack is as follows: > > >> > > >> device_driver_detach() -> > > >> device_release_driver_internal() -> > > >> __device_release_driver() -> > > >> device_remove() -> > > >> platform_remove() -> > > >> dev_pm_domain_detach() > > >> > > >> In the upcoming Renesas RZ/G3S thermal driver, the struct > > >> thermal_zone_device_ops::change_mode API is implemented to > > >> start/stop the thermal sensor unit. Register settings are updated > > >> within the change_mode API. > > >> > > >> In case devres helpers are used for thermal zone > > >> register/unregister the struct thermal_zone_device_ops::change_mode > > >> API is invoked when the driver is unbound. The identified call stack is as follows: > > >> > > >> device_driver_detach() -> > > >> device_release_driver_internal() -> > > >> device_unbind_cleanup() -> > > >> devres_release_all() -> > > >> devm_thermal_of_zone_release() -> > > >> thermal_zone_device_disable() -> > > >> thermal_zone_device_set_mode() -> > > >> rzg3s_thermal_change_mode() > > >> > > >> The device_unbind_cleanup() function is called after the thermal > > >> device is detached from the PM domain (via dev_pm_domain_detach()). > > >> > > >> The rzg3s_thermal_change_mode() implementation calls > > >> pm_runtime_resume_and_get()/pm_runtime_put_autosuspend() > > >> before/after accessing the registers. However, during the unbind > > >> scenario, the > > >> devm_thermal_of_zone_release() is invoked after dev_pm_domain_detach(). > > >> Consequently, the clocks are not enabled, as the device is removed > > >> from the PM domain at this time, leading to an Asynchronous SError Interrupt. > > >> The system cannot be used after this. > > > > > > I've been through the driver before responding to this change. What > > > is the benefit of powering down / up (or clock off / on) the thermal > > > sensor when reading the temperature ? > > > > > > I can understand for disable / enable but I don't get for the > > > classic usage where a governor will be reading the temperature regularly. > > > > I tried to be as power saving as possible both at runtime and after > > the IP is not used anymore as the HW manual doesn't mentioned anything > > about accuracy or implications of disabling the IP clock at runtime. > > We use similar approach (of disabling clocks at runtime) for other IPs > > in the RZ/G3S SoC as well. > > > > > > > > Would the IP need some cycles to capture the temperature accurately > > > after the clock is enabled ? > > > > There is nothing about this mentioned about this in the HW manual of > > the RZ/G3S SoC. The only points mentioned are as described in the driver code: > > - wait at least 3us after each IIO channel read > > - wait at least 30us after enabling the sensor > > - wait at least 50us after setting OE bit in TSU_SM > > > > For this I chose to have it implemented as proposed. > > IMO, disabling/enabling the clock between two reads through the pm runtime may not be a good thing, > especially if the system enters a thermal situation where it has to mitigate. Just a question, You mean to avoid device destruction due to high temperature?? Assuming disabling the clk happens when the temp reaches the boundary and re-enabling of the clk after a time(which involves monitoring the CLK ON bit after enabling it, or a run time enable failure happens), where it exceeds the threshold?? Cheers, Biju