Hi Amit, > -----Original Message----- > From: amit kachhap [mailto:amitdanielk@xxxxxxxxx] On Behalf Of Amit Daniel > Kachhap > Sent: Wednesday, February 22, 2012 3:44 PM > To: linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx > Cc: linux-kernel@xxxxxxxxxxxxxxx; mjg59@xxxxxxxxxxxxx; linux- > acpi@xxxxxxxxxxxxxxx; lenb@xxxxxxxxxx; linaro-dev@xxxxxxxxxxxxxxxx; > amit.kachhap@xxxxxxxxxx; R, Durgadoss; rob.lee@xxxxxxxxxx; patches@xxxxxxxxxx > Subject: [PATCH 2/4] thermal: Add generic cpufreq cooling implementation > > This patch adds support for generic cpu thermal cooling low level > implementations using frequency scaling up/down based on the request > from user. Different cpu related cooling devices can be registered by the I believe what you mean by 'user' is another Driver using this code.. right ?? > user and the binding of these cooling devices to the corresponding > trip points can be easily done as the registration API's return the > cooling device pointer. The user of these api's are responsible for > passing clipping frequency in percentages. > > Signed-off-by: Amit Daniel Kachhap <amit.kachhap@xxxxxxxxxx> > --- > Documentation/thermal/cpu-cooling-api.txt | 40 ++++ > drivers/thermal/Kconfig | 11 + > drivers/thermal/Makefile | 1 + > drivers/thermal/cpu_cooling.c | 310 +++++++++++++++++++++++++++++ > include/linux/cpu_cooling.h | 54 +++++ > 5 files changed, 416 insertions(+), 0 deletions(-) > create mode 100644 Documentation/thermal/cpu-cooling-api.txt > create mode 100644 drivers/thermal/cpu_cooling.c > create mode 100644 include/linux/cpu_cooling.h > > diff --git a/Documentation/thermal/cpu-cooling-api.txt > b/Documentation/thermal/cpu-cooling-api.txt > new file mode 100644 > index 0000000..04de67c > --- /dev/null > +++ b/Documentation/thermal/cpu-cooling-api.txt > @@ -0,0 +1,40 @@ > +CPU cooling api's How To > +=================================== > + > +Written by Amit Daniel Kachhap <amit.kachhap@xxxxxxxxxx> > + > +Updated: 13 Dec 2011 > + > +Copyright (c) 2011 Samsung Electronics Co., Ltd(http://www.samsung.com) > + > +0. Introduction > + > +The generic cpu cooling(freq clipping, cpuhotplug) provides > +registration/unregistration api's to the user. The binding of the cooling > +devices to the trip point is left for the user. The registration api's returns > +the cooling device pointer. > + > +1. cpufreq cooling api's > + > +1.1 cpufreq registration api's > +1.1.1 struct thermal_cooling_device *cpufreq_cooling_register( > + struct freq_pctg_table *tab_ptr, unsigned int tab_size, > + const struct cpumask *mask_val) > + > + This interface function registers the cpufreq cooling device with the name > + "thermal-cpufreq-%x". This api can support multiple instance of cpufreq > cooling > + devices. > + > + tab_ptr: The table containing the percentage of frequency to be clipped > for > + each cooling state. > + .freq_clip_pctg: Percentage of frequency to be clipped for each allowed > + cpus. > + .polling_interval: polling interval for this cooling state. > + tab_size: the total number of cooling state. Although I can understand that the table size is equal to the total number of cooling states, it is better to name it 'num_cooling_states' (or something) that means what it is.. > + mask_val: all the allowed cpu's where frequency clipping can happen. > + > +1.1.2 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev) > + > + This interface function unregisters the "thermal-cpufreq-%x" cooling > device. > + > + cdev: Cooling device pointer which has to be unregistered. > diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig > index f7f71b2..298c1cd 100644 > --- a/drivers/thermal/Kconfig > +++ b/drivers/thermal/Kconfig > @@ -18,3 +18,14 @@ config THERMAL_HWMON > depends on THERMAL > depends on HWMON=y || HWMON=THERMAL > default y > + > +config CPU_THERMAL > + bool "generic cpu cooling support" > + depends on THERMAL > + help > + This implements the generic cpu cooling mechanism through frequency > + reduction, cpu hotplug and any other ways of reducing temperature. An > + ACPI version of this already exists(drivers/acpi/processor_thermal.c). > + This will be useful for platforms using the generic thermal interface > + and not the ACPI interface. > + If you want this support, you should say Y or M here. > diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile > index 31108a0..655cbc4 100644 > --- a/drivers/thermal/Makefile > +++ b/drivers/thermal/Makefile > @@ -3,3 +3,4 @@ > # > > obj-$(CONFIG_THERMAL) += thermal_sys.o > +obj-$(CONFIG_CPU_THERMAL) += cpu_cooling.o > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c > new file mode 100644 > index 0000000..298f550 > --- /dev/null > +++ b/drivers/thermal/cpu_cooling.c > @@ -0,0 +1,310 @@ > +/* > + * linux/drivers/thermal/cpu_cooling.c > + * > + * Copyright (C) 2011 Samsung Electronics Co., Ltd(http://www.samsung.com) > + * Copyright (C) 2011 Amit Daniel <amit.kachhap@xxxxxxxxxx> > + * > + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; version 2 of the License. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License along > + * with this program; if not, write to the Free Software Foundation, Inc., > + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. > + * > + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + */ > +#include <linux/kernel.h> > +#include <linux/module.h> > +#include <linux/thermal.h> > +#include <linux/platform_device.h> > +#include <linux/cpufreq.h> > +#include <linux/err.h> > +#include <linux/slab.h> > +#include <linux/cpu.h> > +#include <linux/cpu_cooling.h> > + > +#ifdef CONFIG_CPU_FREQ Except the _idr methods, all the code is inside this #ifdef. So, I think it is better to add this dependency in Kconfig, and leave this code clean w/o many #ifdef's. > +struct cpufreq_cooling_device { > + int id; > + struct thermal_cooling_device *cool_dev; > + struct freq_pctg_table *tab_ptr; > + unsigned int tab_size; > + unsigned int cpufreq_state; > + const struct cpumask *allowed_cpus; > + struct list_head node; > +}; > + > +static LIST_HEAD(cooling_cpufreq_list); > +static DEFINE_MUTEX(cooling_cpufreq_lock); > +static DEFINE_IDR(cpufreq_idr); > +static struct cpufreq_cooling_device *notify_cpufreq; Please move this after the DEFINE_PER_CPU. Hard to notice here.. > +static DEFINE_PER_CPU(unsigned int, max_policy_freq); > +#endif /*CONFIG_CPU_FREQ*/ > + > +static int get_idr(struct idr *idr, struct mutex *lock, int *id) > +{ > + int err; > +again: > + if (unlikely(idr_pre_get(idr, GFP_KERNEL) == 0)) > + return -ENOMEM; > + > + if (lock) > + mutex_lock(lock); > + err = idr_get_new(idr, NULL, id); > + if (lock) > + mutex_unlock(lock); > + if (unlikely(err == -EAGAIN)) > + goto again; > + else if (unlikely(err)) > + return err; > + > + *id = *id & MAX_ID_MASK; > + return 0; > +} > + > +static void release_idr(struct idr *idr, struct mutex *lock, int id) > +{ > + if (lock) > + mutex_lock(lock); > + idr_remove(idr, id); > + if (lock) > + mutex_unlock(lock); > +} > + > +#ifdef CONFIG_CPU_FREQ > +/*Below codes defines functions to be used for cpufreq as cooling device*/ > +static bool is_cpufreq_valid(int cpu) > +{ > + struct cpufreq_policy policy; > + if (!cpufreq_get_policy(&policy, cpu)) > + return true; > + return false; Why not just do a return !cpufreq_get_policy(&policy, cpu); > +} > + > +static int cpufreq_apply_cooling(struct cpufreq_cooling_device > *cpufreq_device, > + unsigned long cooling_state) > +{ > + int cpuid, this_cpu = smp_processor_id(); > + > + if (!is_cpufreq_valid(this_cpu)) You are not using this_cpu anywhere else..so, directly use Smp_processor_id() here.. > + return 0; > + > + if (cooling_state > cpufreq_device->tab_size) > + return -EINVAL; > + > + /*Check if last cooling level is same as current cooling level*/ Use either 'state' or 'level' in comments as well as the variable name Makes it easy to read.. > + if (cpufreq_device->cpufreq_state == cooling_state) > + return 0; > + > + cpufreq_device->cpufreq_state = cooling_state; > + > + /*cpufreq thermal notifier uses this cpufreq device pointer*/ > + notify_cpufreq = cpufreq_device; > + > + for_each_cpu(cpuid, cpufreq_device->allowed_cpus) { > + if (is_cpufreq_valid(cpuid)) > + cpufreq_update_policy(cpuid); > + } > + > + return 0; > +} > + > +static int thermal_cpufreq_notifier(struct notifier_block *nb, > + unsigned long event, void *data) > +{ > + struct cpufreq_policy *policy = data; > + struct freq_pctg_table *th_table; > + unsigned long max_freq = 0; > + unsigned int th_pctg = 0, level; > + > + if (event != CPUFREQ_ADJUST) > + return 0; > + > + if (!notify_cpufreq) > + return 0; Why not combine both if's with an || ? > + > + level = notify_cpufreq->cpufreq_state; Yes..here it is..please use level/state.. > + > + if (level > 0) { > + th_table = > + &(notify_cpufreq->tab_ptr[level - 1]); > + th_pctg = th_table->freq_clip_pctg; > + max_freq = > + (policy->cpuinfo.max_freq * (100 - th_pctg)) / 100; > + > + if (per_cpu(max_policy_freq, policy->cpu) == 0) > + per_cpu(max_policy_freq, policy->cpu) = policy->max; > + } else { > + if (per_cpu(max_policy_freq, policy->cpu) != 0) { > + max_freq = per_cpu(max_policy_freq, policy->cpu); > + per_cpu(max_policy_freq, policy->cpu) = 0; > + } else { > + max_freq = policy->max; > + } > + } > + > + cpufreq_verify_within_limits(policy, 0, max_freq); > + > + return 0; > +} > + > +/* > + * cpufreq cooling device callback functions > + */ > +static int cpufreq_get_max_state(struct thermal_cooling_device *cdev, > + unsigned long *state) > +{ > + struct cpufreq_cooling_device *cpufreq_device = NULL; Why assigning NULL ? > + > + mutex_lock(&cooling_cpufreq_lock); > + list_for_each_entry(cpufreq_device, &cooling_cpufreq_list, node) > + if (cpufreq_device && cpufreq_device->cool_dev == cdev) > + break; > + > + mutex_unlock(&cooling_cpufreq_lock); > + if (!cpufreq_device || cpufreq_device->cool_dev != cdev) > + return -EINVAL; > + > + *state = cpufreq_device->tab_size; > + return 0; > +} The above can be simplified this way: int ret = -EINVAL; mutex_lock(&cooling_cpufreq_lock); list_for_each_entry(cpufreq_device, &cooling_cpufreq_list, node) if (cpufreq_device->cool_dev == cdev) { *state = cpufreq_device->tab_size; ret = 0; break; } mutex_unlock(&cooling_cpufreq_lock); return ret; I think the same can be done for the get_ function below..and similar ones in the patch 3/4. > + > +static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev, > + unsigned long *state) > +{ > + struct cpufreq_cooling_device *cpufreq_device = NULL; > + > + mutex_lock(&cooling_cpufreq_lock); > + list_for_each_entry(cpufreq_device, &cooling_cpufreq_list, node) > + if (cpufreq_device && cpufreq_device->cool_dev == cdev) > + break; > + > + mutex_unlock(&cooling_cpufreq_lock); > + if (!cpufreq_device || cpufreq_device->cool_dev != cdev) > + return -EINVAL; > + *state = cpufreq_device->cpufreq_state; > + return 0; > +} > + > +/*This cooling may be as PASSIVE/ACTIVE type*/ > +static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev, > + unsigned long state) > +{ > + struct cpufreq_cooling_device *cpufreq_device = NULL; > + > + mutex_lock(&cooling_cpufreq_lock); > + list_for_each_entry(cpufreq_device, &cooling_cpufreq_list, node) > + if (cpufreq_device && cpufreq_device->cool_dev == cdev) > + break; > + > + mutex_unlock(&cooling_cpufreq_lock); > + if (!cpufreq_device || cpufreq_device->cool_dev != cdev) > + return -EINVAL; > + > + cpufreq_apply_cooling(cpufreq_device, state); > + return 0; > +} > + > +/* bind cpufreq callbacks to cpufreq cooling device */ > +static struct thermal_cooling_device_ops cpufreq_cooling_ops = { > + .get_max_state = cpufreq_get_max_state, > + .get_cur_state = cpufreq_get_cur_state, > + .set_cur_state = cpufreq_set_cur_state, > +}; > + > +static struct notifier_block thermal_cpufreq_notifier_block = { > + .notifier_call = thermal_cpufreq_notifier, > +}; > + > +struct thermal_cooling_device *cpufreq_cooling_register( > + struct freq_pctg_table *tab_ptr, unsigned int tab_size, > + const struct cpumask *mask_val) > +{ > + struct thermal_cooling_device *cool_dev; > + struct cpufreq_cooling_device *cpufreq_dev = NULL; > + unsigned int count = 0; > + char dev_name[THERMAL_NAME_LENGTH]; > + int ret = 0, id = 0; > + > + if (tab_ptr == NULL || tab_size == 0) > + return ERR_PTR(-EINVAL); > + > + list_for_each_entry(cpufreq_dev, &cooling_cpufreq_list, node) > + count++; > + > + cpufreq_dev = > + kzalloc(sizeof(struct cpufreq_cooling_device), GFP_KERNEL); > + > + if (!cpufreq_dev) > + return ERR_PTR(-ENOMEM); > + > + cpufreq_dev->tab_ptr = tab_ptr; > + cpufreq_dev->tab_size = tab_size; > + cpufreq_dev->allowed_cpus = mask_val; > + > + ret = get_idr(&cpufreq_idr, &cooling_cpufreq_lock, &cpufreq_dev->id); > + if (ret) { > + kfree(cpufreq_dev); > + return ERR_PTR(-EINVAL); > + } > + > + sprintf(dev_name, "thermal-cpufreq-%d", cpufreq_dev->id); > + > + cool_dev = thermal_cooling_device_register(dev_name, cpufreq_dev, > + &cpufreq_cooling_ops); > + if (!cool_dev) { > + release_idr(&cpufreq_idr, &cooling_cpufreq_lock, > + cpufreq_dev->id); > + kfree(cpufreq_dev); > + return ERR_PTR(-EINVAL); > + } > + cpufreq_dev->id = id; > + cpufreq_dev->cool_dev = cool_dev; > + mutex_lock(&cooling_cpufreq_lock); > + list_add_tail(&cpufreq_dev->node, &cooling_cpufreq_list); > + mutex_unlock(&cooling_cpufreq_lock); > + > + if (count == 0) > + cpufreq_register_notifier(&thermal_cpufreq_notifier_block, > + CPUFREQ_POLICY_NOTIFIER); Why do we register only when count is 0 ? Or should this be 'if (count > 0)' ? > + return cool_dev; > +} > +EXPORT_SYMBOL(cpufreq_cooling_register); > + > +void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev) > +{ > + struct cpufreq_cooling_device *cpufreq_dev = NULL; > + unsigned int count = 0; > + > + mutex_lock(&cooling_cpufreq_lock); > + list_for_each_entry(cpufreq_dev, &cooling_cpufreq_list, node) { > + if (cpufreq_dev && cpufreq_dev->cool_dev == cdev) > + break; > + count++; > + } > + > + if (!cpufreq_dev || cpufreq_dev->cool_dev != cdev) { > + mutex_unlock(&cooling_cpufreq_lock); > + return; > + } > + > + list_del(&cpufreq_dev->node); > + mutex_unlock(&cooling_cpufreq_lock); > + > + if (count == 1) Same here..I do not get the idea behind this.. Shouldn't this be 'if (count > 0)' ? In general, I would like to see a real driver using these API's. This will help everybody to understand the working of these API's much better. Thanks, Durga > + cpufreq_unregister_notifier(&thermal_cpufreq_notifier_block, > + CPUFREQ_POLICY_NOTIFIER); > + > + thermal_cooling_device_unregister(cpufreq_dev->cool_dev); > + release_idr(&cpufreq_idr, &cooling_cpufreq_lock, cpufreq_dev->id); > + kfree(cpufreq_dev); > +} > +EXPORT_SYMBOL(cpufreq_cooling_unregister); > +#endif /*CONFIG_CPU_FREQ*/ > diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h > new file mode 100644 > index 0000000..5dc5632 > --- /dev/null > +++ b/include/linux/cpu_cooling.h > @@ -0,0 +1,54 @@ > +/* > + * linux/include/linux/cpu_cooling.h > + * > + * Copyright (C) 2011 Samsung Electronics Co., Ltd(http://www.samsung.com) > + * Copyright (C) 2011 Amit Daniel <amit.kachhap@xxxxxxxxxx> > + * > + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; version 2 of the License. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License along > + * with this program; if not, write to the Free Software Foundation, Inc., > + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. > + * > + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + */ > + > +#ifndef __CPU_COOLING_H__ > +#define __CPU_COOLING_H__ > + > +#include <linux/thermal.h> > + > +struct freq_pctg_table { > + unsigned int freq_clip_pctg; > + unsigned int polling_interval; > +}; > + > +#ifdef CONFIG_CPU_FREQ > +struct thermal_cooling_device *cpufreq_cooling_register( > + struct freq_pctg_table *tab_ptr, unsigned int tab_size, > + const struct cpumask *mask_val); > + > +void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev); > +#else /*!CONFIG_CPU_FREQ*/ > +static inline struct thermal_cooling_device *cpufreq_cooling_register( > + struct freq_pctg_table *tab_ptr, unsigned int tab_size, > + const struct cpumask *mask_val) > +{ > + return NULL; > +} > +static inline void cpufreq_cooling_unregister( > + struct thermal_cooling_device *cdev) > +{ > + return; > +} > +#endif /*CONFIG_CPU_FREQ*/ > + > +#endif /* __CPU_COOLING_H__ */ > -- > 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html