Re: [RFC 0/8] Stop monitoring disabled devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 07/04/2020 19:49, Andrzej Pietrasiewicz wrote:
> The current kernel behavior is to keep polling the thermal zone devices
> regardless of their current mode. This is not desired, as all such "disabled"
> devices are meant to be handled by userspace,> so polling them makes no sense.

Thanks for proposing these changes.

I've been (quickly) through the series and the description below. I have
the feeling the series makes more complex while the current code which
would deserve a cleanup.

Why not first:

 - Add a 'mode' field in the thermal zone device
 - Kill all set/get_mode callbacks in the drivers which are duplicated code.
 - Add a function:

 enum thermal_device_mode thermal_zone_get_mode( *tz)
 {
	...
	if (tz->ops->get_mode)
		return tz->ops->get_mode();

	return tz->mode;
 }


 int thermal_zone_set_mode(..*tz, enum thermal_device_mode mode)
 {
	...
	if (tz->ops->set_mode)
		return tz->ops->set_mode(tz, mode);

	tz->mode = mode;

	return 0;
 }

 static inline thermal_zone_enable(... *tz)
 {
	thermal_zone_set_mode(tz, THERMAL_DEVICE_ENABLED);
 }

 static inline thermal_zone_disable(... *tz) {
	thermal_zone_set_mode(tz, THERMAL_DEVICE_DISABLED);
 }

And then when the code is consolidated, use the mode to enable/disable
the polling and continue killing the duplicated code in of-thermal.c and
anywhere else.


> There was an attempt to solve this issue:
> 
> https://lkml.org/lkml/2018/2/26/498
> 
> and it ultimately has not succeeded:
> 
> https://lkml.org/lkml/2018/2/27/910
> 
> This is a new attempt addressing all the relevant drivers, and I have
> identified them with:
> 
> $ git grep "thermal_zone_device_ops" | grep "= {" | cut -f1 -d: | sort | uniq
> 
> The idea is to modify thermal_zone_device_update() and monitor_thermal_zone()
> in such a way that they stop polling a disabled device. To do decide what to
> do they should call ->get_mode() operation of the specialized thermal zone
> device in question (e.g. drivers/acpi/thermal.c's). But here comes problem:
> sometimes a thermal zone device must be initially disabled and becomes enabled
> only after its sensors appear on the system. If such thermal zone's
> ->get_mode() /* in the context of thermal_zone_device_update() or
> monitor_thermal_zone() */ is called _before_ the sensors are available, it will
> be reported as "disabled" and consequently polling it will be ceased. This is
> a change in behavior from userspace's perspective.
> 
> To solve the above described problem I want to introduce the third mode of a
> thermal_zone_device: initial. The idea is that when the device is in its
> initial mode, then its polling will be handled as it is now. This is a good
> thing: should the temperature be just about hitting the critical treshnold
> early during the boot process, it might be too late if we wait for the
> userspace to run to save the system from overheating. The initial mode should
> be reported in sysfs as "enabled" to keep the userspace interface intact.
> From the initial mode there will be two possible transitions: to enabled or
> disabled mode, but there will be no transition back to initial. If the
> transition is from initial to enabled, then keep polling. If the transition is
> from initial to disabled, then stop polling. If the transition is from enabled
> to disabled, then stop polling. The transition from disabled to enabled must
> be handled in a special way: there must be a mandatory call to
> monitor_thermal_zone(), otherwise the polling will not start. If this
> transition is triggeted from sysfs, then it can be easily handled at the
> thermal framework level. However, if drivers call their own ->set_mode()
> operation then they must also call "monitor_thermal_zone()" afterwards.
> The latter being a sensible thing anyway, so perhaps all/most of the drivers
> in question do. The plan for implementation is this:
> 
> - ensure ALL users use symbolic enum names (THERMAL_DEVICE_DISABLED,
> THERMAL_DEVICE_ENABLED) for thermal device mode rather than the numeric
> values of enum thermal_device_mode elements
> - add THERMAL_DEVICE_INITIAL to the said enum making its value 0 (so that
> kzalloc() results in the initial state)
> - modify thermal zone device's mode_show() (thermal framework level) so that
> it reports "enabled" for THERMAL_DEVICE_INITIAL
> - modify thermal zone device's mode_store() (thermal framework level) so that
> it calls monitor_thermal_zone() upon mode change
> - modify ALL thermal drivers so that their code is prepared to return
> THERMAL_DEVICE_INITIAL before they call thermal_zone_device_register(); when
> the invocation of the latter completes then polling is expected to be started
> - verify ALL drivers which call their own ->set_mode() to ensure they do call
> monitor_thermal_zone() afterwards
> - modify thermal_zone_device_update() and monitor_thermal_zone() so that they
> cancel polling for disabled thermal zone devices (but not for those in
> THERMAL_DEVICE_INITIAL mode)
> 
> This RFC series does all the above steps in more or less that order.
> 
> I kindly ask for comments/suggestions/improvements.
> 
> Rebased onto v5.6.
> 
> Andrzej Pietrasiewicz (8):
>   thermal: int3400_thermal: Statically initialize
>     .get_mode()/.set_mode() ops
>   thermal: Properly handle mode values in .set_mode()
>   thermal: Store thermal mode in a dedicated enum
>   thermal: core: Introduce THERMAL_DEVICE_INITIAL
>   thermal: core: Monitor thermal zone after mode change
>   thermal: Set initial state to THERMAL_DEVICE_INITIAL
>   thermal: of: Monitor thermal zone after enabling it
>   thermal: Stop polling DISABLED thermal devices
> 
>  drivers/acpi/thermal.c                        | 28 +++++-----
>  .../ethernet/mellanox/mlxsw/core_thermal.c    | 11 +++-
>  drivers/platform/x86/acerhdf.c                | 17 ++++--
>  drivers/thermal/da9062-thermal.c              |  2 +-
>  drivers/thermal/imx_thermal.c                 |  5 +-
>  .../intel/int340x_thermal/int3400_thermal.c   | 24 ++++-----
>  .../thermal/intel/intel_quark_dts_thermal.c   |  6 ++-
>  drivers/thermal/of-thermal.c                  |  9 +++-
>  drivers/thermal/thermal_core.c                | 52 ++++++++++++++++++-
>  drivers/thermal/thermal_core.h                |  2 +
>  drivers/thermal/thermal_sysfs.c               | 12 +++--
>  include/linux/thermal.h                       |  3 +-
>  12 files changed, 123 insertions(+), 48 deletions(-)
> 


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog



[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux