Hi eduardo, Again thanks for the review. On 7 February 2012 00:25, Eduardo Valentin <eduardo.valentin@xxxxxx> wrote: > Hello Amit, > > On Tue, Dec 13, 2011 at 08:43:16PM +0530, Amit Daniel Kachhap wrote: >> This patch adds support for generic cpu thermal cooling low level >> implementations using frequency scaling and cpuhotplugg currently. >> Different cpu related cooling devices can be registered by the >> user and the binding of these cooling devices to the corresponding >> trip points can be easily done as the registration API's return the >> cooling device pointer. >> >> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@xxxxxxxxxx> >> --- >> Documentation/thermal/cpu-cooling-api.txt | 52 +++++ >> drivers/thermal/Kconfig | 11 + >> drivers/thermal/Makefile | 1 + >> drivers/thermal/cpu_cooling.c | 302 +++++++++++++++++++++++++++++ >> include/linux/cpu_cooling.h | 45 +++++ >> 5 files changed, 411 insertions(+), 0 deletions(-) >> create mode 100644 Documentation/thermal/cpu-cooling-api.txt >> create mode 100644 drivers/thermal/cpu_cooling.c >> create mode 100644 include/linux/cpu_cooling.h >> >> diff --git a/Documentation/thermal/cpu-cooling-api.txt b/Documentation/thermal/cpu-cooling-api.txt >> new file mode 100644 >> index 0000000..d30b4f2 >> --- /dev/null >> +++ b/Documentation/thermal/cpu-cooling-api.txt >> @@ -0,0 +1,52 @@ >> +CPU cooling api's How To >> +=================================== >> + >> +Written by Amit Daniel Kachhap <amit.kachhap@xxxxxxxxxx> >> + >> +Updated: 13 Dec 2011 >> + >> +Copyright (c) 2011 Samsung Electronics Co., Ltd(http://www.samsung.com) >> + >> +0. Introduction >> + >> +The generic cpu cooling(freq clipping, cpuhotplug) provides >> +registration/unregistration api's to the user. The binding of the cooling >> +devices to the trip types is left for the user. The registration api's returns >> +the cooling device pointer. >> + >> +1. cpufreq cooling api's >> + >> +1.1 cpufreq registration api's >> +1.1.1 struct thermal_cooling_device *cpufreq_cooling_register( >> + struct freq_pctg_table *tab_ptr, unsigned int tab_size, >> + const struct cpumask *mask_val) >> + >> + This interface function registers the cpufreq cooling device with the name >> + "thermal-cpufreq". >> + >> + tab_ptr: The table containing the percentage of frequency to be clipped for >> + each cooling state. >> + .freq_clip_pctg[NR_CPUS]:Percentage of frequency to be clipped for each >> + cpu. >> + .polling_interval: polling interval for this cooling state. >> + tab_size: the total number of cooling state. >> + mask_val: all the allowed cpu's where frequency clipping can happen. >> + >> +1.1.2 void cpufreq_cooling_unregister(void) >> + >> + This interface function unregisters the "thermal-cpufreq" cooling device. >> + >> + >> +1.2 cpuhotplug registration api's >> + >> +1.2.1 struct thermal_cooling_device *cpuhotplug_cooling_register( >> + const struct cpumask *mask_val) >> + >> + This interface function registers the cpuhotplug cooling device with the name >> + "thermal-cpuhotplug". >> + >> + mask_val: all the allowed cpu's which can be hotplugged out. >> + >> +1.1.2 void cpuhotplug_cooling_unregister(void) >> + >> + This interface function unregisters the "thermal-cpuhotplug" cooling device. >> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig >> index f7f71b2..298c1cd 100644 >> --- a/drivers/thermal/Kconfig >> +++ b/drivers/thermal/Kconfig >> @@ -18,3 +18,14 @@ config THERMAL_HWMON >> depends on THERMAL >> depends on HWMON=y || HWMON=THERMAL >> default y >> + >> +config CPU_THERMAL >> + bool "generic cpu cooling support" >> + depends on THERMAL >> + help >> + This implements the generic cpu cooling mechanism through frequency >> + reduction, cpu hotplug and any other ways of reducing temperature. An >> + ACPI version of this already exists(drivers/acpi/processor_thermal.c). >> + This will be useful for platforms using the generic thermal interface >> + and not the ACPI interface. >> + If you want this support, you should say Y or M here. >> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile >> index 31108a0..655cbc4 100644 >> --- a/drivers/thermal/Makefile >> +++ b/drivers/thermal/Makefile >> @@ -3,3 +3,4 @@ >> # >> >> obj-$(CONFIG_THERMAL) += thermal_sys.o >> +obj-$(CONFIG_CPU_THERMAL) += cpu_cooling.o >> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c >> new file mode 100644 >> index 0000000..cdd148c >> --- /dev/null >> +++ b/drivers/thermal/cpu_cooling.c >> @@ -0,0 +1,302 @@ >> +/* >> + * linux/drivers/thermal/cpu_cooling.c >> + * >> + * Copyright (C) 2011 Samsung Electronics Co., Ltd(http://www.samsung.com) >> + * Copyright (C) 2011 Amit Daniel <amit.kachhap@xxxxxxxxxx> >> + * >> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of the GNU General Public License as published by >> + * the Free Software Foundation; version 2 of the License. >> + * >> + * This program is distributed in the hope that it will be useful, but >> + * WITHOUT ANY WARRANTY; without even the implied warranty of >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + * General Public License for more details. >> + * >> + * You should have received a copy of the GNU General Public License along >> + * with this program; if not, write to the Free Software Foundation, Inc., >> + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. >> + * >> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> + */ >> +#include <linux/kernel.h> >> +#include <linux/module.h> >> +#include <linux/thermal.h> >> +#include <linux/platform_device.h> >> +#include <linux/cpufreq.h> >> +#include <linux/err.h> >> +#include <linux/slab.h> >> +#include <linux/cpu.h> >> +#include <linux/cpu_cooling.h> >> + >> +#ifdef CONFIG_CPU_FREQ >> +struct cpufreq_cooling_device { >> + struct thermal_cooling_device *cool_dev; >> + struct freq_pctg_table *tab_ptr; >> + unsigned int tab_size; >> + unsigned int cpufreq_state; >> + const struct cpumask *allowed_cpus; >> +}; >> + >> +static struct cpufreq_cooling_device *cpufreq_device; >> + >> +/*Below codes defines functions to be used for cpufreq as cooling device*/ >> +static bool is_cpufreq_valid(int cpu) >> +{ >> + struct cpufreq_policy policy; >> + if (!cpufreq_get_policy(&policy, cpu)) >> + return true; >> + return false; >> +} >> + >> +static int cpufreq_apply_cooling(int cooling_state) >> +{ > > Why do you need cooling_state to be signed? You used it always against > unsigned values. Besides, the thermal api imposes unsigned long. Yes, unsigned long is a better way. > >> + int cpuid, this_cpu = smp_processor_id(); >> + >> + if (!is_cpufreq_valid(this_cpu)) >> + return 0; >> + >> + if (cooling_state > cpufreq_device->tab_size) >> + return -EINVAL; >> + >> + /*Check if last cooling level is same as current cooling level*/ >> + if (cpufreq_device->cpufreq_state == cooling_state) >> + return 0; >> + >> + cpufreq_device->cpufreq_state = cooling_state; >> + >> + for_each_cpu(cpuid, cpufreq_device->allowed_cpus) { >> + if (is_cpufreq_valid(cpuid)) >> + cpufreq_update_policy(cpuid); >> + } >> + >> + return 0; >> +} >> + >> +static int thermal_cpufreq_notifier(struct notifier_block *nb, >> + unsigned long event, void *data) >> +{ >> + struct cpufreq_policy *policy = data; >> + struct freq_pctg_table *th_table; >> + unsigned long max_freq = 0; >> + unsigned int cpu = policy->cpu, th_pctg = 0, level; >> + >> + if (event != CPUFREQ_ADJUST) >> + return 0; >> + >> + level = cpufreq_device->cpufreq_state; >> + >> + if (level > 0) { >> + th_table = >> + &(cpufreq_device->tab_ptr[level - 1]); >> + th_pctg = th_table->freq_clip_pctg[cpu]; >> + } >> + >> + max_freq = >> + (policy->cpuinfo.max_freq * (100 - th_pctg)) / 100; > > > Might be interesting to extend this a bit. You could have the > above as default policy setup, but you could also allow people > to change this behavior. For instance, adding some callback option > while registering the cooling device (or a notifier), which would > determine the max_freq for the generic layer. The mechanism to > deploy max_freq into the system must be generic enough with cpufreq > APIs, but the decision to which max_freq to use, could be specific. > Don't you think? I agree that there will always some constraints on the system to cap the max frequency. But these thermal clippings is called for rare scenarios. Anyway, Your idea is worth considering. I will check if cpufreq has any generic way of achieving this. > >> + >> + cpufreq_verify_within_limits(policy, 0, max_freq); >> + >> + return 0; >> +} >> + >> +/* >> + * cpufreq cooling device callback functions >> + */ >> +static int cpufreq_get_max_state(struct thermal_cooling_device *cdev, >> + unsigned long *state) >> +{ >> + *state = cpufreq_device->tab_size; >> + return 0; >> +} >> + >> +static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev, >> + unsigned long *state) >> +{ >> + *state = cpufreq_device->cpufreq_state; >> + return 0; >> +} >> + >> +/*This cooling may be as PASSIVE/STATE_ACTIVE type*/ >> +static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev, >> + unsigned long state) >> +{ >> + cpufreq_apply_cooling(state); >> + return 0; >> +} >> + >> +/* bind cpufreq callbacks to cpufreq cooling device */ >> +static struct thermal_cooling_device_ops cpufreq_cooling_ops = { >> + .get_max_state = cpufreq_get_max_state, >> + .get_cur_state = cpufreq_get_cur_state, >> + .set_cur_state = cpufreq_set_cur_state, >> +}; >> + >> +static struct notifier_block thermal_cpufreq_notifier_block = { >> + .notifier_call = thermal_cpufreq_notifier, >> +}; >> + >> +struct thermal_cooling_device *cpufreq_cooling_register( >> + struct freq_pctg_table *tab_ptr, unsigned int tab_size, >> + const struct cpumask *mask_val) >> +{ >> + struct thermal_cooling_device *cool_dev; >> + >> + if (tab_ptr == NULL || tab_size == 0) >> + return ERR_PTR(-EINVAL); >> + >> + cpufreq_device = >> + kzalloc(sizeof(struct cpufreq_cooling_device), GFP_KERNEL); >> + >> + if (!cpufreq_device) >> + return ERR_PTR(-ENOMEM); >> + >> + cool_dev = thermal_cooling_device_register("thermal-cpufreq", NULL, >> + &cpufreq_cooling_ops); >> + if (!cool_dev) { >> + kfree(cpufreq_device); >> + return ERR_PTR(-EINVAL); >> + } >> + >> + cpufreq_device->tab_ptr = tab_ptr; >> + cpufreq_device->tab_size = tab_size; >> + cpufreq_device->cool_dev = cool_dev; >> + cpufreq_device->allowed_cpus = mask_val; >> + >> + cpufreq_register_notifier(&thermal_cpufreq_notifier_block, >> + CPUFREQ_POLICY_NOTIFIER); >> + return cool_dev; >> +} >> +EXPORT_SYMBOL(cpufreq_cooling_register); >> + >> +void cpufreq_cooling_unregister(void) >> +{ >> + cpufreq_unregister_notifier(&thermal_cpufreq_notifier_block, >> + CPUFREQ_POLICY_NOTIFIER); >> + thermal_cooling_device_unregister(cpufreq_device->cool_dev); >> + kfree(cpufreq_device); >> +} >> +EXPORT_SYMBOL(cpufreq_cooling_unregister); >> +#else /*!CONFIG_CPU_FREQ*/ >> +struct thermal_cooling_device *cpufreq_cooling_register( >> + struct freq_pctg_table *tab_ptr, unsigned int tab_size) >> +{ >> + return NULL; >> +} >> +EXPORT_SYMBOL(cpufreq_cooling_register); >> +void cpufreq_cooling_unregister(void) >> +{ >> + return; >> +} >> +EXPORT_SYMBOL(cpufreq_cooling_unregister); >> +#endif /*CONFIG_CPU_FREQ*/ >> + >> +#ifdef CONFIG_HOTPLUG_CPU >> + >> +struct hotplug_cooling_device { >> + struct thermal_cooling_device *cool_dev; >> + unsigned int hotplug_state; >> + const struct cpumask *allowed_cpus; >> +}; >> +static struct hotplug_cooling_device *hotplug_device; >> + >> +/* >> + * cpu hotplug cooling device callback functions >> + */ >> +static int cpuhotplug_get_max_state(struct thermal_cooling_device *cdev, >> + unsigned long *state) >> +{ >> + *state = 1; >> + return 0; >> +} >> + >> +static int cpuhotplug_get_cur_state(struct thermal_cooling_device *cdev, >> + unsigned long *state) >> +{ >> + /*This cooling device may be of type ACTIVE, so state field >> + can be 0 or 1*/ >> + *state = hotplug_device->hotplug_state; >> + return 0; >> +} >> + >> +/*This cooling may be as PASSIVE/STATE_ACTIVE type*/ >> +static int cpuhotplug_set_cur_state(struct thermal_cooling_device *cdev, >> + unsigned long state) >> +{ >> + int cpuid, this_cpu = smp_processor_id(); >> + >> + if (hotplug_device->hotplug_state == state) >> + return 0; >> + >> + /*This cooling device may be of type ACTIVE, so state field >> + can be 0 or 1*/ > > /* > * Small thing. Check the style for multi line comments. > * This is also an example. > */ Ok. > >> + if (state == 1) { >> + for_each_cpu(cpuid, hotplug_device->allowed_cpus) { >> + if (cpu_online(cpuid) && (cpuid != this_cpu)) >> + cpu_down(cpuid); >> + } >> + } else if (state == 0) { >> + for_each_cpu(cpuid, hotplug_device->allowed_cpus) { >> + if (!cpu_online(cpuid) && (cpuid != this_cpu)) >> + cpu_up(cpuid); >> + } >> + } else >> + return -EINVAL; >> + >> + hotplug_device->hotplug_state = state; >> + >> + return 0; >> +} >> +/* bind hotplug callbacks to cpu hotplug cooling device */ >> +static struct thermal_cooling_device_ops cpuhotplug_cooling_ops = { >> + .get_max_state = cpuhotplug_get_max_state, >> + .get_cur_state = cpuhotplug_get_cur_state, >> + .set_cur_state = cpuhotplug_set_cur_state, >> +}; >> + >> +struct thermal_cooling_device *cpuhotplug_cooling_register( >> + const struct cpumask *mask_val) >> +{ >> + struct thermal_cooling_device *cool_dev; >> + >> + hotplug_device = >> + kzalloc(sizeof(struct hotplug_cooling_device), GFP_KERNEL); >> + >> + if (!hotplug_device) >> + return ERR_PTR(-ENOMEM); >> + >> + cool_dev = thermal_cooling_device_register("thermal-cpuhotplug", NULL, >> + &cpuhotplug_cooling_ops); >> + if (!cool_dev) { >> + kfree(hotplug_device); >> + return ERR_PTR(-EINVAL); >> + } >> + >> + hotplug_device->cool_dev = cool_dev; >> + hotplug_device->hotplug_state = 0; >> + hotplug_device->allowed_cpus = mask_val; >> + >> + return cool_dev; >> +} >> +EXPORT_SYMBOL(cpuhotplug_cooling_register); >> + >> +void cpuhotplug_cooling_unregister(void) >> +{ >> + thermal_cooling_device_unregister(hotplug_device->cool_dev); >> + kfree(hotplug_device); >> +} >> +EXPORT_SYMBOL(cpuhotplug_cooling_unregister); >> +#else /*!CONFIG_HOTPLUG_CPU*/ >> +struct thermal_cooling_device *cpuhotplug_cooling_register( >> + const struct cpumask *mask_val) >> +{ >> + return NULL; >> +} >> +EXPORT_SYMBOL(cpuhotplug_cooling_register); >> +void cpuhotplug_cooling_unregister(void) >> +{ >> + return; >> +} >> +EXPORT_SYMBOL(cpuhotplug_cooling_unregister); >> +#endif /*CONFIG_HOTPLUG_CPU*/ >> diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h >> new file mode 100644 >> index 0000000..0c57375 >> --- /dev/null >> +++ b/include/linux/cpu_cooling.h >> @@ -0,0 +1,45 @@ >> +/* >> + * linux/include/linux/cpu_cooling.h >> + * >> + * Copyright (C) 2011 Samsung Electronics Co., Ltd(http://www.samsung.com) >> + * Copyright (C) 2011 Amit Daniel <amit.kachhap@xxxxxxxxxx> >> + * >> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of the GNU General Public License as published by >> + * the Free Software Foundation; version 2 of the License. >> + * >> + * This program is distributed in the hope that it will be useful, but >> + * WITHOUT ANY WARRANTY; without even the implied warranty of >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + * General Public License for more details. >> + * >> + * You should have received a copy of the GNU General Public License along >> + * with this program; if not, write to the Free Software Foundation, Inc., >> + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. >> + * >> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> + */ >> + >> +#ifndef __CPU_COOLING_H__ >> +#define __CPU_COOLING_H__ >> + >> +#include <linux/thermal.h> >> + >> +struct freq_pctg_table { >> + unsigned int freq_clip_pctg[NR_CPUS]; >> + unsigned int polling_interval; >> +}; >> + >> +extern struct thermal_cooling_device *cpufreq_cooling_register( >> + struct freq_pctg_table *tab_ptr, unsigned int tab_size, >> + const struct cpumask *mask_val); > > I suppose your original idea was to have this function called only once right? > I'd suggest to add some documentation on this header file to specify this. > > Besides, the fact that you receive a cpumask may suggest you could register > different cooling device for subsets of your system cpus. > > In any case, if you call this function more than once, you may end with memory leaks > and the last call will be the one acting in the system. > yes these api's are meant to be called once. I am working on a new version where all these function will be multi-instance and hence different cpus can support different frequency clipping. > > >> + >> +extern void cpufreq_cooling_unregister(void); >> + >> +extern struct thermal_cooling_device *cpuhotplug_cooling_register( >> + const struct cpumask *mask_val); >> + >> +extern void cpuhotplug_cooling_unregister(void); >> + >> +#endif /* __CPU_COOLING_H__ */ > > Another thing, I have the impression that solving the config selection in the > header file instead of in .c files is cleaner. If you want to solve the config > selection in .c files, better to do so with Makefiles. > > If you keep the ifdefery ( THERMAL vs. HOTPLUG_CPU vs. CPU_FREQ ) here in the > header file, it is easier to read. Ok will move them in .h > >> -- >> 1.7.1 >> >> _______________________________________________ >> linux-pm mailing list >> linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx >> https://lists.linuxfoundation.org/mailman/listinfo/linux-pm _______________________________________________ linux-pm mailing list linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/linux-pm