On 04/29/2014 10:21 AM, Viresh Kumar wrote:
> Nice effort.
>

Thanks! :-)

> On 29 April 2014 00:25, Srivatsa S. Bhat
> <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> wrote:
>> Now all such drivers have been fixed, but debugging this issue was not
>> very straight-forward (even lockdep didn't catch this). So let us add a
>> debug infrastructure to the cpufreq core to catch such issues more easily
>> in the future.
>
> BUT, I am not sure if we really need it :(
>
> I think we just got into the 'barrier' stuff as we had some doubts about it
> earlier and were quite sure that nothing else could go wrong. Otherwise
> the only problem could have been present was the second queuing
> from the same thread. And we will surely get stuck if that happens and
> we can't just miss that error..
>

Yeah, and we _did_ hit that hang, but it was not at all intuitive at first
as to what was going wrong. Worse, even lockdep is not in a position to
catch such scenarios. So it definitely doesn't hurt to add a small
infrastructure to catch such issues in the future, IMHO.

Besides, if we can add features for users, surely we can also add some
non-intrusive debug code for ourselves too, to make our lives easier,
right? :-) I'm sure we deserve that privilege ;-)

Regards,
Srivatsa S. Bhat

>> Scenario 1: (Deadlock-free)
>> ----------
>>
>>         Task A                                  Task B
>>
>>     /* 1st freq transition */
>>     Invoke _begin() {
>>             ...
>>             ...
>>     }
>>
>>     Change the frequency
>>
>>                                         Got interrupt for successful
>>                                         change of frequency.
>>
>>                                         /* 1st freq transition */
>>                                         Invoke _end() {
>>                                                 ...
>>                                                 ...
>>     /* 2nd freq transition */                   ...
>>     Invoke _begin() {                           ...
>>             ... //waiting for B                 ...
>>             ... //to finish _end()      }
>>             ...
>>             ...
>>     }
>>
>>
>> This scenario is actually deadlock-free because Task A can wait inside the
>> second call to _begin() without self-deadlocking, because it is the
>> responsibility of Task B to finish the first sequence by invoking the
>> corresponding _end().
>>
>> By setting the value of 'transition_task' again explicitly in _end(), we
>> ensure that the code won't print a false-positive warning in this case.
>>
>> However the same code successfully catches the following deadlock-prone
>> scenario even in ASYNC_NOTIFICATION drivers:
>>
>> Scenario 2: (Deadlock-prone)
>> ----------
>>
>>         Task A                                  Task B
>>
>>     /* 1st freq transition */
>>     Invoke _begin() {
>>             ...
>>             ...
>>     }
>>
>>     /* 2nd freq transition */
>>     Invoke _begin() {
>>             ...
>>             ...
>>     }
>>
>>     Change the frequency
>>
>>
>> Here the bug is that Task A called the second _begin() *before* actually
>> performing the 1st frequency transition. In other words, it failed to set
>> Task B in motion for the 1st frequency transition, and hence it will
>> self-deadlock. This is very similar to the case of drivers which do
>> synchronous notification, and hence the debug infrastructure developed
>> in this patch can catch this scenario easily.
>>
>> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx>
>> ---
>>
>>  drivers/cpufreq/cpufreq.c |   12 ++++++++++++
>>  include/linux/cpufreq.h   |    1 +
>>  2 files changed, 13 insertions(+)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index abda660..2c99a6c 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -354,6 +354,10 @@ static void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
>>  void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
>>                  struct cpufreq_freqs *freqs)
>>  {
>> +
>> +        /* Catch double invocations of _begin() which lead to self-deadlock */
>> +        WARN_ON(current == policy->transition_task);
>> +
>>  wait:
>>          wait_event(policy->transition_wait, !policy->transition_ongoing);
>>
>> @@ -365,6 +369,7 @@ wait:
>>          }
>>
>>          policy->transition_ongoing = true;
>> +        policy->transition_task = current;
>>
>>          spin_unlock(&policy->transition_lock);
>>
>> @@ -378,9 +383,16 @@ void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
>>          if (unlikely(WARN_ON(!policy->transition_ongoing)))
>>                  return;
>>
>> +        /*
>> +         * The task invoking _end() could be different from the one that
>> +         * invoked the _begin(). So set ->transition_task again here
>> +         * explicity.
>> +         */
>> +        policy->transition_task = current;
>>          cpufreq_notify_post_transition(policy, freqs, transition_failed);
>>
>>          policy->transition_ongoing = false;
>> +        policy->transition_task = NULL;
>>
>>          wake_up(&policy->transition_wait);
>>  }
>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>> index 5ae5100..8f44d79 100644
>> --- a/include/linux/cpufreq.h
>> +++ b/include/linux/cpufreq.h
>> @@ -110,6 +110,7 @@ struct cpufreq_policy {
>>          bool                    transition_ongoing; /* Tracks transition status */
>>          spinlock_t              transition_lock;
>>          wait_queue_head_t       transition_wait;
>> +        struct task_struct      *transition_task; /* Task which is doing the transition */
>>  };
>>
>>  /* Only for ACPI */
>>
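
For anyone who wants to step through the two scenarios from the changelog without booting a kernel, here is a rough single-threaded userspace model of the check. It is only a sketch and not kernel code: struct task, struct policy, the warnings counter, and the model_*() and scenario_*() helpers are all made-up names, the spinlock and wait queue are left out, and "blocking" is represented by a boolean return value instead of wait_event().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct task_struct. */
struct task { const char *name; };

/* Only the fields that matter for the debug check (names mirror the patch). */
struct policy {
    bool transition_ongoing;
    const struct task *transition_task;
    unsigned int warnings;              /* counts would-be WARN_ON() hits */
};

/*
 * Models the entry of _begin(). Returns true if the caller would have to
 * wait for an ongoing transition (the real code sleeps in wait_event()).
 */
static bool model_begin(struct policy *p, const struct task *cur)
{
    if (cur == p->transition_task)      /* same task queued twice */
        p->warnings++;

    if (p->transition_ongoing)
        return true;

    p->transition_ongoing = true;
    p->transition_task = cur;
    return false;
}

/* First half of _end(): take over ownership before the notifiers run. */
static void model_end_enter(struct policy *p, const struct task *cur)
{
    p->transition_task = cur;
}

/* Second half of _end(): mark the transition as finished. */
static void model_end_finish(struct policy *p)
{
    p->transition_ongoing = false;
    p->transition_task = NULL;
}

/* Scenario 1: Task B is still inside _end() when Task A calls _begin(). */
static unsigned int scenario_1_warnings(void)
{
    struct task a = { "A" }, b = { "B" };
    struct policy p = { false, NULL, 0 };

    model_begin(&p, &a);                    /* 1st transition starts */
    model_end_enter(&p, &b);                /* B enters _end() */
    bool must_wait = model_begin(&p, &a);   /* A waits, no warning */
    model_end_finish(&p);                   /* B completes _end() */
    if (must_wait)
        model_begin(&p, &a);                /* A proceeds after wakeup */
    return p.warnings;
}

/* Scenario 2: Task A calls _begin() twice without any _end(). */
static unsigned int scenario_2_warnings(void)
{
    struct task a = { "A" };
    struct policy p = { false, NULL, 0 };

    model_begin(&p, &a);
    model_begin(&p, &a);                    /* would self-deadlock */
    return p.warnings;
}
```

In this model scenario_1_warnings() yields 0 and scenario_2_warnings() yields 1. If model_end_enter() is dropped from scenario 1, the second model_begin() sees Task A as the owner and counts a warning, which is the false positive that the extra assignment in _end() is there to avoid.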