Re: [PATCH v6 1/6] thermal: add generic cpufreq cooling implementation

On Thu, 2012-08-16 at 17:11 +0530, Amit Daniel Kachhap wrote:
> This patchset introduces a new generic cooling device based on cpufreq
> that can be used on non-ACPI platforms.  As a proof of concept, we have
> drivers for the following platforms using this mechanism now:
> 
>  * Samsung Exynos (Exynos4 and Exynos5) in the current patchset.
>  * Freescale i.MX (git://git.linaro.org/people/amitdanielk/linux.git imx6q_thermal)
> 
> There is a small change in cpufreq cooling registration APIs, so a minor
> change is needed for Freescale platforms.
> 
> Brief Description:
> 
> 1) The generic cooling devices code is placed inside drivers/thermal/*
>    as placing inside acpi folder will need un-necessary enabling of acpi
>    code.  This code is architecture independent.
> 
> 2) This patchset adds generic cpu cooling low level implementation
>    through frequency clipping.  In future, other cpu related cooling
>    devices may be added here.  An ACPI version of this already exists
>    (drivers/acpi/processor_thermal.c).  But this will be useful for
>    platforms like ARM using the generic thermal interface along with the
>    generic cpu cooling devices.  The cooling device registration APIs
>    return cooling device pointers which can be easily bound to the
>    thermal zone trip points.  The important APIs exposed are:
> 
>    a) struct thermal_cooling_device *cpufreq_cooling_register(
>         struct cpumask *clip_cpus)
>    b) void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
> 
> 3) Samsung exynos platform thermal implementation is done using the
>    generic cpu cooling APIs and the new trip type.  The temperature sensor
>    driver present in the hwmon folder(registered as hwmon driver) is moved
>    to thermal folder and registered as a thermal driver.
> 
> A simple data/control flow diagram is shown below:
> 
> Core Linux thermal <----->  Exynos thermal interface <----- Temperature Sensor
>           |                             |
>          \|/                            |
>   Cpufreq cooling device <---------------
> 
> TODO:
> *Will send the DT enablement patches later after the driver is merged.
> 
> This patch:
> 
> Add support for generic cpu thermal cooling low level implementations
> using frequency scaling up/down based on the registration parameters.
> Different cpu related cooling devices can be registered by the user and
> the binding of these cooling devices to the corresponding trip points can
> be easily done as the registration APIs return the cooling device pointer.
> The user of these APIs is responsible for passing the clipping frequency.
> The drivers can also register to receive notification about any cooling
> action called.
> 
> [akpm@xxxxxxxxxxxxxxxxxxxx: fix comment layout]
> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@xxxxxxxxxx>
> Cc: Guenter Roeck <guenter.roeck@xxxxxxxxxxxx>
> Cc: SangWook Ju <sw.ju@xxxxxxxxxxx>
> Cc: Durgadoss <durgadoss.r@xxxxxxxxx>
> Cc: Len Brown <lenb@xxxxxxxxxx>
> Cc: Jean Delvare <khali@xxxxxxxxxxxx>
> Cc: Kyungmin Park <kmpark@xxxxxxxxxxxxx>
> Cc: Kukjin Kim <kgene.kim@xxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Amit Daniel Kachhap <amit.daniel@xxxxxxxxxxx>
> ---
>  Documentation/thermal/cpu-cooling-api.txt |   52 +++
>  drivers/thermal/Kconfig                   |   11 +
>  drivers/thermal/Makefile                  |    1 +
>  drivers/thermal/cpu_cooling.c             |  512 +++++++++++++++++++++++++++++
>  include/linux/cpu_cooling.h               |   79 +++++
>  5 files changed, 655 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/thermal/cpu-cooling-api.txt
>  create mode 100644 drivers/thermal/cpu_cooling.c
>  create mode 100644 include/linux/cpu_cooling.h
> 
> diff --git a/Documentation/thermal/cpu-cooling-api.txt b/Documentation/thermal/cpu-cooling-api.txt
> new file mode 100644
> index 0000000..a1f2a6b
> --- /dev/null
> +++ b/Documentation/thermal/cpu-cooling-api.txt
> @@ -0,0 +1,52 @@
> +CPU cooling APIs How To
> +===================================
> +
> +Written by Amit Daniel Kachhap <amit.kachhap@xxxxxxxxxx>
> +
> +Updated: 12 May 2012
> +
> +Copyright (c)  2012 Samsung Electronics Co., Ltd(http://www.samsung.com)
> +
> +0. Introduction
> +
> +The generic cpu cooling(freq clipping) provides registration/unregistration APIs
> +to the caller. The binding of the cooling devices to the trip point is left for
> +the user. The registration APIs return the cooling device pointer.
> +
> +1. cpu cooling APIs
> +
> +1.1 cpufreq registration/unregistration APIs
> +1.1.1 struct thermal_cooling_device *cpufreq_cooling_register(
> +	struct cpumask *clip_cpus)
> +
> +    This interface function registers the cpufreq cooling device with the name
> +    "thermal-cpufreq-%x". This api can support multiple instances of cpufreq
> +    cooling devices.
> +
> +   clip_cpus: cpumask of cpus where the frequency constraints will happen.
> +
> +1.1.2 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
> +
> +    This interface function unregisters the "thermal-cpufreq-%x" cooling device.
> +
> +    cdev: Cooling device pointer which has to be unregistered.
> +
> +
> +1.2 CPU cooling action notifier register/unregister interface
> +1.2.1 int cputherm_register_notifier(struct notifier_block *nb,
> +	unsigned int list)
> +
> +    This interface registers a driver with cpu cooling layer. The driver will
> +    be notified when any cpu cooling action is called.
> +
> +    nb: notifier function to register
> +    list: CPUFREQ_COOLING_START or CPUFREQ_COOLING_STOP
> +
> +1.2.2 int cputherm_unregister_notifier(struct notifier_block *nb,
> +	unsigned int list)
> +
> +    This interface unregisters a driver from the cpu cooling layer. The driver
> +    will no longer be notified when a cpu cooling action is called.
> +
> +    nb: notifier function to unregister
> +    list: CPUFREQ_COOLING_START or CPUFREQ_COOLING_STOP

What are these two APIs used for?
I do not see them used anywhere in your patch set. Am I missing something?

> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> index 7dd8c34..996003b 100644
> --- a/drivers/thermal/Kconfig
> +++ b/drivers/thermal/Kconfig
> @@ -19,6 +19,17 @@ config THERMAL_HWMON
>  	depends on HWMON=y || HWMON=THERMAL
>  	default y
>  
> +config CPU_THERMAL
> +	bool "generic cpu cooling support"
> +	depends on THERMAL && CPU_FREQ
> +	help
> +	  This implements the generic cpu cooling mechanism through frequency
> +	  reduction, cpu hotplug and any other ways of reducing temperature. An
> +	  ACPI version of this already exists(drivers/acpi/processor_thermal.c).
> +	  This will be useful for platforms using the generic thermal interface
> +	  and not the ACPI interface.
> +	  If you want this support, you should say Y here.
> +
>  config SPEAR_THERMAL
>  	bool "SPEAr thermal sensor driver"
>  	depends on THERMAL
> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> index fd9369a..aae59ad 100644
> --- a/drivers/thermal/Makefile
> +++ b/drivers/thermal/Makefile
> @@ -3,5 +3,6 @@
>  #
>  
>  obj-$(CONFIG_THERMAL)		+= thermal_sys.o
> +obj-$(CONFIG_CPU_THERMAL)		+= cpu_cooling.o
>  obj-$(CONFIG_SPEAR_THERMAL)		+= spear_thermal.o
>  obj-$(CONFIG_RCAR_THERMAL)	+= rcar_thermal.o
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> new file mode 100644
> index 0000000..c42e557
> --- /dev/null
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -0,0 +1,512 @@
> +/*
> + *  linux/drivers/thermal/cpu_cooling.c
> + *
> + *  Copyright (C) 2012	Samsung Electronics Co., Ltd(http://www.samsung.com)
> + *  Copyright (C) 2012  Amit Daniel <amit.kachhap@xxxxxxxxxx>
> + *
> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; version 2 of the License.
> + *
> + *  This program is distributed in the hope that it will be useful, but
> + *  WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + *  General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License along
> + *  with this program; if not, write to the Free Software Foundation, Inc.,
> + *  59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
> + *
> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> + */
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/thermal.h>
> +#include <linux/platform_device.h>
> +#include <linux/cpufreq.h>
> +#include <linux/err.h>
> +#include <linux/slab.h>
> +#include <linux/cpu.h>
> +#include <linux/cpu_cooling.h>
> +
> +/**
> + * struct cpufreq_cooling_device
> + * @id: unique integer value corresponding to each cpufreq_cooling_device
> + *	registered.
> + * @cool_dev: thermal_cooling_device pointer to keep track of the
> + *	registered cooling device.
> + * @cpufreq_state: integer value representing the current state of cpufreq
> + *	cooling	devices.
> + * @cpufreq_val: integer value representing the absolute value of the clipped
> + *	frequency.
> + * @allowed_cpus: all the cpus involved for this cpufreq_cooling_device.
> + * @node: list_head to link all cpufreq_cooling_device together.
> + *
> + * This structure is required for keeping information of each
> + * cpufreq_cooling_device registered as a list whose head is represented by
> + * cooling_cpufreq_list. In order to prevent corruption of this list a
> + * mutex lock cooling_cpufreq_lock is used.
> + */
> +struct cpufreq_cooling_device {
> +	int id;
> +	struct thermal_cooling_device *cool_dev;
> +	unsigned int cpufreq_state;
> +	unsigned int cpufreq_val;
> +	struct cpumask allowed_cpus;
> +	struct list_head node;
> +};
> +static LIST_HEAD(cooling_cpufreq_list);
> +static DEFINE_IDR(cpufreq_idr);
> +
> +static struct mutex cooling_cpufreq_lock;
> +
> +/* notify_device passes value to the CPUFREQ_ADJUST callback function. */
> +#define NOTIFY_INVALID NULL
> +struct cpufreq_cooling_device *notify_device;
> +
> +/* Head of the blocking notifier chain to inform about frequency clamping */
> +static BLOCKING_NOTIFIER_HEAD(cputherm_state_notifier_list);
> +
> +/**
> + * get_idr - function to get a unique id.
> + * @idr: struct idr * handle used to create a id.
> + * @id: int * value generated by this function.
> + */
> +static int get_idr(struct idr *idr, int *id)
> +{
> +	int err;
> +again:
> +	if (unlikely(idr_pre_get(idr, GFP_KERNEL) == 0))
> +		return -ENOMEM;
> +
> +	mutex_lock(&cooling_cpufreq_lock);
> +	err = idr_get_new(idr, NULL, id);
> +	mutex_unlock(&cooling_cpufreq_lock);
> +
> +	if (unlikely(err == -EAGAIN))
> +		goto again;
> +	else if (unlikely(err))
> +		return err;
> +
> +	*id = *id & MAX_ID_MASK;
> +	return 0;
> +}
> +
> +/**
> + * release_idr - function to free the unique id.
> + * @idr: struct idr * handle used for creating the id.
> + * @id: int value representing the unique id.
> + */
> +static void release_idr(struct idr *idr, int id)
> +{
> +	mutex_lock(&cooling_cpufreq_lock);
> +	idr_remove(idr, id);
> +	mutex_unlock(&cooling_cpufreq_lock);
> +}
> +
> +/**
> + * cputherm_register_notifier - Register a notifier with cpu cooling interface.
> + * @nb:	struct notifier_block * with callback info.
> + * @list: integer value for which notification is needed. possible values are
> + *	CPUFREQ_COOLING_START and CPUFREQ_COOLING_STOP.
> + *
> + * This exported function registers a driver with cpu cooling layer. The driver
> + * will be notified when any cpu cooling action is called.
> + */
> +int cputherm_register_notifier(struct notifier_block *nb, unsigned int list)
> +{
> +	int ret = 0;
> +
> +	switch (list) {
> +	case CPUFREQ_COOLING_START:
> +	case CPUFREQ_COOLING_STOP:
> +		ret = blocking_notifier_chain_register(
> +				&cputherm_state_notifier_list, nb);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +	}
> +	return ret;
> +}
> +EXPORT_SYMBOL(cputherm_register_notifier);
> +
> +/**
> + * cputherm_unregister_notifier - Un-register a notifier.
> + * @nb:	struct notifier_block * with callback info.
> + * @list: integer value for which notification is needed. values possible are
> + *	CPUFREQ_COOLING_START or CPUFREQ_COOLING_STOP.
> + *
> + * This exported function un-registers a driver with cpu cooling layer.
> + */
> +int cputherm_unregister_notifier(struct notifier_block *nb, unsigned int list)
> +{
> +	int ret = 0;
> +
> +	switch (list) {
> +	case CPUFREQ_COOLING_START:
> +	case CPUFREQ_COOLING_STOP:
> +		ret = blocking_notifier_chain_unregister(
> +				&cputherm_state_notifier_list, nb);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +	}
> +	return ret;
> +}
> +EXPORT_SYMBOL(cputherm_unregister_notifier);
> +
> +/* Below code defines functions to be used for cpufreq as cooling device */
> +
> +/**
> + * is_cpufreq_valid - function to check if a cpu has frequency transition policy.
> + * @cpu: cpu for which check is needed.
> + */
> +static int is_cpufreq_valid(int cpu)
> +{
> +	struct cpufreq_policy policy;
> +	return !cpufreq_get_policy(&policy, cpu);
> +}
> +
> +/**
> + * get_cpu_frequency - get the absolute value of frequency from level.
> + * @cpu: cpu for which frequency is fetched.
> + * @level: level of frequency of the CPU
> + *	e.g level=1 --> 1st MAX FREQ, LEVEL=2 ---> 2nd MAX FREQ, .... etc
> + */
> +static unsigned int get_cpu_frequency(unsigned int cpu, unsigned long level)
> +{
> +	int ret = 0, i = 0;
> +	unsigned long level_index;
> +	bool descend = false;
> +	struct cpufreq_frequency_table *table =
> +					cpufreq_frequency_get_table(cpu);
> +	if (!table)
> +		return ret;
> +
> +	while (table[i].frequency != CPUFREQ_TABLE_END) {
> +		if (table[i].frequency == CPUFREQ_ENTRY_INVALID)
> +			continue;
> +
> +		/*check if table in ascending or descending order*/
> +		if ((table[i + 1].frequency != CPUFREQ_TABLE_END) &&
> +			(table[i + 1].frequency < table[i].frequency)
> +			&& !descend) {
> +			descend = true;
> +		}
> +
> +		/*return if level matched and table in descending order*/
> +		if (descend && i == level)
> +			return table[i].frequency;
> +		i++;
> +	}
> +	i--;
> +
> +	if (level > i || descend)
> +		return ret;
> +	level_index = i - level;
> +
> +	/*Scan the table in reverse order and match the level*/
> +	while (i >= 0) {
> +		if (table[i].frequency == CPUFREQ_ENTRY_INVALID)
> +			continue;
> +		/*return if level matched*/
> +		if (i == level_index)
> +			return table[i].frequency;
> +		i--;
> +	}
> +	return ret;
> +}
> +
> +/**
> + * cpufreq_apply_cooling - function to apply frequency clipping.
> + * @cpufreq_device: cpufreq_cooling_device pointer containing frequency
> + *	clipping data.
> + * @cooling_state: value of the cooling state.
> + */
> +static int cpufreq_apply_cooling(struct cpufreq_cooling_device *cpufreq_device,
> +				unsigned long cooling_state)
> +{
> +	unsigned int event, cpuid, clip_freq;
> +	struct cpumask *maskPtr = &cpufreq_device->allowed_cpus;
> +	unsigned int cpu = cpumask_any(maskPtr);
> +
> +
> +	/* Check if the old cooling action is same as new cooling action */
> +	if (cpufreq_device->cpufreq_state == cooling_state)
> +		return 0;
> +
> +	clip_freq = get_cpu_frequency(cpu, cooling_state);
> +	if (!clip_freq)
> +		return -EINVAL;
> +
> +	cpufreq_device->cpufreq_state = cooling_state;
> +	cpufreq_device->cpufreq_val = clip_freq;
> +	notify_device = cpufreq_device;
> +
> +	if (cooling_state != 0)
> +		event = CPUFREQ_COOLING_START;
> +	else
> +		event = CPUFREQ_COOLING_STOP;
> +
> +	blocking_notifier_call_chain(&cputherm_state_notifier_list,
> +						event, &clip_freq);
> +
> +	for_each_cpu(cpuid, maskPtr) {
> +		if (is_cpufreq_valid(cpuid))
> +			cpufreq_update_policy(cpuid);
> +	}
> +
> +	notify_device = NOTIFY_INVALID;
> +
> +	return 0;
> +}
> +
> +/**
> + * cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
> + * @nb:	struct notifier_block * with callback info.
> + * @event: value showing cpufreq event for which this function invoked.
> + * @data: callback-specific data
> + */
> +static int cpufreq_thermal_notifier(struct notifier_block *nb,
> +					unsigned long event, void *data)
> +{
> +	struct cpufreq_policy *policy = data;
> +	unsigned long max_freq = 0;
> +
> +	if (event != CPUFREQ_ADJUST || notify_device == NOTIFY_INVALID)
> +		return 0;
> +
> +	if (cpumask_test_cpu(policy->cpu, &notify_device->allowed_cpus))
> +		max_freq = notify_device->cpufreq_val;
> +
> +	/* Never exceed user_policy.max*/
> +	if (max_freq > policy->user_policy.max)
> +		max_freq = policy->user_policy.max;
> +
> +	if (policy->max != max_freq)
> +		cpufreq_verify_within_limits(policy, 0, max_freq);
> +
> +	return 0;
> +}
> +
> +/*
> + * cpufreq cooling device callback functions are defined below
> + */
> +
> +/**
> + * cpufreq_get_max_state - callback function to get the max cooling state.
> + * @cdev: thermal cooling device pointer.
> + * @state: fill this variable with the max cooling state.
> + */
> +static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
> +				 unsigned long *state)
> +{
> +	int ret = -EINVAL, i = 0;
> +	struct cpufreq_cooling_device *cpufreq_device;
> +	struct cpumask *maskPtr;
> +	unsigned int cpu;
> +	struct cpufreq_frequency_table *table;
> +
> +	mutex_lock(&cooling_cpufreq_lock);
> +	list_for_each_entry(cpufreq_device, &cooling_cpufreq_list, node) {
> +		if (cpufreq_device && cpufreq_device->cool_dev == cdev)
> +			break;
> +	}
> +	if (cpufreq_device == NULL)
> +		goto return_get_max_state;
> +
> +	maskPtr = &cpufreq_device->allowed_cpus;
> +	cpu = cpumask_any(maskPtr);
> +	table = cpufreq_frequency_get_table(cpu);
> +	if (!table) {
> +		*state = 0;
> +		ret = 0;
> +		goto return_get_max_state;
> +	}
> +
> +	while (table[i].frequency != CPUFREQ_TABLE_END) {
> +		if (table[i].frequency == CPUFREQ_ENTRY_INVALID)
> +			continue;
> +		i++;
> +	}
> +	if (i > 0) {
> +		*state = --i;
> +		ret = 0;
> +	}
> +
> +return_get_max_state:
> +	mutex_unlock(&cooling_cpufreq_lock);
> +	return ret;
> +}
> +
> +/**
> + * cpufreq_get_cur_state - callback function to get the current cooling state.
> + * @cdev: thermal cooling device pointer.
> + * @state: fill this variable with the current cooling state.
> + */
> +static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
> +				 unsigned long *state)
> +{
> +	int ret = -EINVAL;
> +	struct cpufreq_cooling_device *cpufreq_device;
> +
> +	mutex_lock(&cooling_cpufreq_lock);
> +	list_for_each_entry(cpufreq_device, &cooling_cpufreq_list, node) {
> +		if (cpufreq_device && cpufreq_device->cool_dev == cdev) {
> +			*state = cpufreq_device->cpufreq_state;
> +			ret = 0;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&cooling_cpufreq_lock);
> +

The cpufreq limits may also be changed elsewhere, e.g. via the sysfs
interface, so we should derive the REAL cooling state from the current
cpu frequency rather than return a cached value.
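
Something along these lines is what I have in mind. It is only a rough
sketch, written as plain user-space C so that it can be tried outside the
kernel. The names freq_to_cooling_state, TABLE_END and ENTRY_INVALID are
made up for illustration (the driver would presumably use policy->max and
the CPUFREQ_* markers instead), and it assumes the frequency table has no
duplicate entries:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the cpufreq table markers (illustrative values). */
#define TABLE_END	(~1u)
#define ENTRY_INVALID	(~0u)

/*
 * Map the frequency cap currently in effect to a cooling state.
 *
 * State 0 means "no clipping" (the cap is the highest valid frequency);
 * every valid table entry above the cap adds one state.  Counting entries
 * instead of using the index makes this independent of whether the table
 * is sorted in ascending or descending order, and invalid entries are
 * skipped without affecting the result.
 */
static unsigned long freq_to_cooling_state(const unsigned int *table,
					   unsigned int cur_max_khz)
{
	unsigned long state = 0;
	size_t i;

	for (i = 0; table[i] != TABLE_END; i++) {
		if (table[i] == ENTRY_INVALID)
			continue;
		if (table[i] > cur_max_khz)
			state++;
	}
	return state;
}
```

With a table of 1400000, 1200000, 1000000 and 800000 kHz and a current cap
of 1000000 kHz this gives state 2, no matter who lowered the cap.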

thanks,
rui



_______________________________________________
lm-sensors mailing list
lm-sensors@xxxxxxxxxxxxxx
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors


