>-----Original Message----- >From: Thomas Renninger [mailto:trenn@xxxxxxx] >Sent: Friday, July 03, 2009 4:41 AM >To: Pallipadi, Venkatesh >Cc: Dave Jones; linux-kernel@xxxxxxxxxxxxxxx; >cpufreq@xxxxxxxxxxxxxxx; kernel-testers@xxxxxxxxxxxxxxx; Ingo >Molnar; Rafael J. Wysocki; Dave Young; Pekka Enberg; Mathieu Desnoyers >Subject: Re: [patch 1/4] cpufreq: Eliminate the recent lockdep >warnings in cpufreq > >On Friday 03 July 2009 02:08:30 venkatesh.pallipadi@xxxxxxxxx wrote: >> Commit b14893a62c73af0eca414cfed505b8c09efc613c although it was very >> much needed to properly cleanup ondemand timer, opened-up a >can of worms >> related to locking dependencies in cpufreq. >> >> Patch here defines the need for dbs_mutex and cleans up its usage in >> ondemand governor. This also resolves the lockdep warnings >reported here >I think I get these changes and now dbs_mutex is needed... >Making sure governor() is always called with rwsem held (hope >that is the >case now) is a good idea. Unfortunately it requires the dbs_mutex in >do_dbs_timer and it will be hard to ever remove it. Yes. Rwsem held for any policy changes from cpufreq core makes this fix clean. I did not like the earlier state where rwsem was helt for START and not for STOP calls etc. Patch 3 and 4 removes dbs_mutex from do_dbs_timer. >I still do not see the need of "dbs_mutex protects data in >dbs_tuners_ins >from concurrent changes", though. If someone enlightens me, that would >be appreciated. OK. Consider these two happening in parallel. echo 0 > /sys/devices/system/cpu/cpu0/cpufreq/ondemand/ignore_nice echo 1 > /sys/devices/system/cpu/cpu4/cpufreq/ondemand/ignore_nice As they are coming from different cpu, rwsem wont protect us and without the dbs_mutex, end state after this can will be unpredictable. prev_cpu_idle and prev_cpu_nice can end up with wrong values where only one of them is set etc. That will affect the ondemand algorithm. Thanks, Venki >> http://lkml.indiana.edu/hypermail/linux/kernel/0906.1/01925.html >> http://lkml.indiana.edu/hypermail/linux/kernel/0907.0/00820.html >> >> and few others.. >> >> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@xxxxxxxxx> >> --- >> drivers/cpufreq/cpufreq.c | 4 ++-- >> drivers/cpufreq/cpufreq_conservative.c | 27 >+++++++++++---------------- >> drivers/cpufreq/cpufreq_ondemand.c | 27 >+++++++++++---------------- >> 3 files changed, 24 insertions(+), 34 deletions(-) >> >> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c >> index 6e2ec0b..c7fe16e 100644 >> --- a/drivers/cpufreq/cpufreq.c >> +++ b/drivers/cpufreq/cpufreq.c >> @@ -1070,8 +1070,6 @@ static int __cpufreq_remove_dev(struct >sys_device >*sys_dev) >> spin_unlock_irqrestore(&cpufreq_driver_lock, flags); >> #endif >> >> - unlock_policy_rwsem_write(cpu); >> - >> if (cpufreq_driver->target) >> __cpufreq_governor(data, CPUFREQ_GOV_STOP); >> >> @@ -1088,6 +1086,8 @@ static int __cpufreq_remove_dev(struct >sys_device >*sys_dev) >> if (cpufreq_driver->exit) >> cpufreq_driver->exit(data); >> >> + unlock_policy_rwsem_write(cpu); >> + >> free_cpumask_var(data->related_cpus); >> free_cpumask_var(data->cpus); >> kfree(data); >> diff --git a/drivers/cpufreq/cpufreq_conservative.c >b/drivers/cpufreq/cpufreq_conservative.c >> index 7fc58af..58889f2 100644 >> --- a/drivers/cpufreq/cpufreq_conservative.c >> +++ b/drivers/cpufreq/cpufreq_conservative.c >> @@ -70,15 +70,10 @@ static DEFINE_PER_CPU(struct cpu_dbs_info_s, >cpu_dbs_info); >> static unsigned int dbs_enable; /* number of CPUs using >this policy */ >> >> /* >> - * DEADLOCK ALERT! There is a ordering requirement between >cpu_hotplug >> - * lock and dbs_mutex. cpu_hotplug lock should always be held before >> - * dbs_mutex. If any function that can potentially take >cpu_hotplug lock >> - * (like __cpufreq_driver_target()) is being called with >dbs_mutex taken, >then >> - * cpu_hotplug lock should be taken before that. Note that >cpu_hotplug lock >> - * is recursive for the same process. -Venki >> - * DEADLOCK ALERT! (2) : do_dbs_timer() must not take the >dbs_mutex, >because it >> - * would deadlock with cancel_delayed_work_sync(), which is >needed for >proper >> - * raceless workqueue teardown. >> + * dbs_mutex protects data in dbs_tuners_ins from >concurrent changes on >> + * different CPUs. It protects dbs_enable in governor >start/stop. It also >> + * serializes governor limit_change with do_dbs_timer. We >do not want >> + * do_dbs_timer to run when user is changing the governor or limits. >> */ >> static DEFINE_MUTEX(dbs_mutex); >> >> @@ -488,18 +483,17 @@ static void do_dbs_timer(struct >work_struct *work) >> >> delay -= jiffies % delay; >> >> - if (lock_policy_rwsem_write(cpu) < 0) >> - return; >> + mutex_lock(&dbs_mutex); >> >> if (!dbs_info->enable) { >> - unlock_policy_rwsem_write(cpu); >> + mutex_unlock(&dbs_mutex); >> return; >> } >> >> dbs_check_cpu(dbs_info); >> >> queue_delayed_work_on(cpu, kconservative_wq, >&dbs_info->work, delay); >> - unlock_policy_rwsem_write(cpu); >> + mutex_unlock(&dbs_mutex); >> } >> >> static inline void dbs_timer_init(struct cpu_dbs_info_s *dbs_info) >> @@ -590,15 +584,16 @@ static int cpufreq_governor_dbs(struct >cpufreq_policy >*policy, >> &dbs_cpufreq_notifier_block, >> CPUFREQ_TRANSITION_NOTIFIER); >> } >> - dbs_timer_init(this_dbs_info); >> - >> mutex_unlock(&dbs_mutex); >> >> + dbs_timer_init(this_dbs_info); >> + >> break; >> >> case CPUFREQ_GOV_STOP: >> - mutex_lock(&dbs_mutex); >> dbs_timer_exit(this_dbs_info); >> + >> + mutex_lock(&dbs_mutex); >> sysfs_remove_group(&policy->kobj, &dbs_attr_group); >> dbs_enable--; >> >> diff --git a/drivers/cpufreq/cpufreq_ondemand.c >b/drivers/cpufreq/cpufreq_ondemand.c >> index 1911d17..246ae14 100644 >> --- a/drivers/cpufreq/cpufreq_ondemand.c >> +++ b/drivers/cpufreq/cpufreq_ondemand.c >> @@ -78,15 +78,10 @@ static DEFINE_PER_CPU(struct cpu_dbs_info_s, >cpu_dbs_info); >> static unsigned int dbs_enable; /* number of CPUs using >this policy */ >> >> /* >> - * DEADLOCK ALERT! There is a ordering requirement between >cpu_hotplug >> - * lock and dbs_mutex. cpu_hotplug lock should always be held before >> - * dbs_mutex. If any function that can potentially take >cpu_hotplug lock >> - * (like __cpufreq_driver_target()) is being called with >dbs_mutex taken, >then >> - * cpu_hotplug lock should be taken before that. Note that >cpu_hotplug lock >> - * is recursive for the same process. -Venki >> - * DEADLOCK ALERT! (2) : do_dbs_timer() must not take the >dbs_mutex, >because it >> - * would deadlock with cancel_delayed_work_sync(), which is >needed for >proper >> - * raceless workqueue teardown. >> + * dbs_mutex protects data in dbs_tuners_ins from >concurrent changes on >> + * different CPUs. It protects dbs_enable in governor >start/stop. It also >> + * serializes governor limit_change with do_dbs_timer. We >do not want >> + * do_dbs_timer to run when user is changing the governor or limits. >> */ >> static DEFINE_MUTEX(dbs_mutex); >> >> @@ -494,11 +489,10 @@ static void do_dbs_timer(struct >work_struct *work) >> >> delay -= jiffies % delay; >> >> - if (lock_policy_rwsem_write(cpu) < 0) >> - return; >> + mutex_lock(&dbs_mutex); >> >> if (!dbs_info->enable) { >> - unlock_policy_rwsem_write(cpu); >> + mutex_unlock(&dbs_mutex); >> return; >> } >> >> @@ -517,7 +511,7 @@ static void do_dbs_timer(struct >work_struct *work) >> dbs_info->freq_lo, CPUFREQ_RELATION_H); >> } >> queue_delayed_work_on(cpu, kondemand_wq, >&dbs_info->work, delay); >> - unlock_policy_rwsem_write(cpu); >> + mutex_unlock(&dbs_mutex); >> } >> >> static inline void dbs_timer_init(struct cpu_dbs_info_s *dbs_info) >> @@ -598,14 +592,15 @@ static int cpufreq_governor_dbs(struct >cpufreq_policy >*policy, >> max(min_sampling_rate, >> latency * LATENCY_MULTIPLIER); >> } >> - dbs_timer_init(this_dbs_info); >> - >> mutex_unlock(&dbs_mutex); >> + >> + dbs_timer_init(this_dbs_info); >> break; >> >> case CPUFREQ_GOV_STOP: >> - mutex_lock(&dbs_mutex); >> dbs_timer_exit(this_dbs_info); >> + >> + mutex_lock(&dbs_mutex); >> sysfs_remove_group(&policy->kobj, &dbs_attr_group); >> dbs_enable--; >> mutex_unlock(&dbs_mutex); >> -- >> 1.6.0.6 >> >> -- >> >> > > >-- To unsubscribe from this list: send the line "unsubscribe kernel-testers" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html