Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized

"Srivatsa S. Bhat" <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> · Mon, 24 Mar 2014 12:18:30 +0530

On 03/21/2014 11:37 PM, Catalin Marinas wrote:
> On Fri, Mar 21, 2014 at 11:24:16AM +0000, Srivatsa S. Bhat wrote:
>> On 03/21/2014 04:35 PM, Catalin Marinas wrote:
>>> On Fri, Mar 21, 2014 at 09:21:02AM +0000, Viresh Kumar wrote:
>>>> @Catalin: We have a problem here and need your expert advice. After changing
>>>> CPU frequency we need to call this code:
>>>>
>>>> cpufreq_notify_post_transition();
>>>> policy->transition_ongoing = false;
>>>>
>>>> And the sequence must be like this only. Is this guaranteed without any
>>>> memory barriers? cpufreq_notify_post_transition() isn't touching
>>>> transition_ongoing at all..
>>>
>>> The above sequence doesn't say much. As rmk said, the compiler wouldn't
>>> reorder the transition_ongoing write before the function call. I think
>>> most architectures (not sure about Alpha) don't do speculative stores,
>>> so hardware wouldn't reorder them either. However, other stores inside
>>> the cpufreq_notify_post_transition() could be reordered after
>>> transition_ongoing store. The same for memory accesses after the
>>> transition_ongoing update, they could be reordered before.
>>>
>>> So what we actually need to know is what are the other relevant memory
>>> accesses that require strict ordering with transition_ongoing.
>>
>> Hmm.. The thing is, _everything_ inside the post_transition() function
>> should complete before writing to transition_ongoing. Because, setting the
>> flag to 'false' indicates the end of the critical section, and the next
>> contending task can enter the critical section.
> 
> smp_mb() is all about relative ordering. So if you want memory accesses
> in post_transition() to be visible to other observers before
> transition_ongoing = false, you also need to make sure that the readers
> of transition_ongoing have a barrier before subsequent memory accesses.
> 

The reader takes a spinlock before reading the flag. Won't that suffice?

+wait:
+	wait_event(policy->transition_wait, !policy->transition_ongoing);
+
+	spin_lock(&policy->transition_lock);
+
+	if (unlikely(policy->transition_ongoing)) {
+		spin_unlock(&policy->transition_lock);
+		goto wait;
+	}
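
For reference, the pairing Catalin describes can be written out explicitly.
This is only a userspace sketch using C11 atomics and pthreads, not kernel
code and not part of the patch. The names notifier_data, finish_transition(),
next_transition() and run_pairing_demo() are made up for illustration. The
release store stands in for a barrier before clearing the flag, and the
acquire load stands in for the matching barrier on the reader side.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical: state written by the post-transition notifiers. */
static int notifier_data;
/* Userspace stand-in for policy->transition_ongoing. */
static atomic_bool transition_ongoing;

/* Writer side: finish the post-transition work, then clear the flag. */
static void *finish_transition(void *arg)
{
	(void)arg;
	notifier_data = 42;	/* stands in for cpufreq_notify_post_transition() */
	/* Release: the store above cannot become visible after this one. */
	atomic_store_explicit(&transition_ongoing, false, memory_order_release);
	return NULL;
}

/* Reader side: wait for the flag to clear, then look at the data. */
static void *next_transition(void *arg)
{
	int *seen = arg;

	/* Acquire: the read below cannot be satisfied before this load. */
	while (atomic_load_explicit(&transition_ongoing, memory_order_acquire))
		sched_yield();
	*seen = notifier_data;
	return NULL;
}

/* Returns the value the reader observed, or -1 if a thread could not start. */
static int run_pairing_demo(void)
{
	pthread_t w, r;
	int seen = -1;

	notifier_data = 0;
	atomic_store(&transition_ongoing, true);

	if (pthread_create(&r, NULL, next_transition, &seen))
		return -1;
	if (pthread_create(&w, NULL, finish_transition, NULL)) {
		/* Let the reader exit before bailing out. */
		atomic_store(&transition_ongoing, false);
		pthread_join(r, NULL);
		return -1;
	}
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return seen;
}
```

With both halves of the pair in place, a reader that sees the flag cleared
also sees notifier_data as written. If either half is weakened to a relaxed
access, that guarantee no longer holds.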

>>> What I find strange in your patch is that
>>> cpufreq_freq_transition_begin() uses spinlocks around transition_ongoing
>>> update but cpufreq_freq_transition_end() doesn't.
>>
>> The reason is that, by the time we drop the spinlock, we would have set
>> the transition_ongoing flag to true, which prevents any other task from
>> entering the critical section. Hence, when we call the _end() function,
>> we are 100% sure that only one task is executing it. Hence locks are not
>> necessary around that second update. In fact, that very update marks the
>> end of the critical section (which acts much like a spin_unlock(&lock)
>> in a "regular" critical section).
> 
> OK, I start to get it. Is there a risk of missing a wake_up event? E.g.
> one thread waking up earlier, noticing that transition is in progress
> and waiting indefinitely?
>

No. The only downside of the CPU reordering the assignment to the flag is
that a new transition can begin while the old one is still finishing up,
that is, while it is still calling the _post_transition() notifiers.
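
To make that failure mode concrete, here is a rough userspace sketch of the
begin/end scheme (C11 atomics and pthreads, not the kernel implementation).
Everything in it is a hypothetical stand-in: struct fake_policy, the polling
loop in place of wait_event(), and a mutex in place of the spinlock. The
release store in transition_end() is the assumption being illustrated: it
keeps the "post-transition" work from becoming visible after the flag is
cleared, so the next contender cannot start while that work is in flight.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the relevant cpufreq_policy fields. */
struct fake_policy {
	pthread_mutex_t transition_lock;
	atomic_bool transition_ongoing;
	atomic_int in_transition;	/* tasks currently inside a transition */
	atomic_int overlaps;		/* times two transitions overlapped */
	long completed;			/* plain data, protected by the flag */
};

static void transition_begin(struct fake_policy *p)
{
	for (;;) {
		/* Polling in place of wait_event(); pairs with transition_end(). */
		while (atomic_load_explicit(&p->transition_ongoing,
					    memory_order_acquire))
			sched_yield();

		pthread_mutex_lock(&p->transition_lock);
		if (!atomic_load_explicit(&p->transition_ongoing,
					  memory_order_acquire)) {
			atomic_store_explicit(&p->transition_ongoing, true,
					      memory_order_relaxed);
			pthread_mutex_unlock(&p->transition_lock);
			return;
		}
		/* Someone else won the race: drop the lock and wait again. */
		pthread_mutex_unlock(&p->transition_lock);
	}
}

static void transition_end(struct fake_policy *p)
{
	/* Release: all work done during the transition is visible first. */
	atomic_store_explicit(&p->transition_ongoing, false,
			      memory_order_release);
}

struct worker_arg { struct fake_policy *p; int iters; };

static void *worker(void *arg)
{
	struct worker_arg *w = arg;

	for (int i = 0; i < w->iters; i++) {
		transition_begin(w->p);
		if (atomic_fetch_add(&w->p->in_transition, 1) != 0)
			atomic_fetch_add(&w->p->overlaps, 1);
		w->p->completed++;	/* stands in for the notifier work */
		atomic_fetch_sub(&w->p->in_transition, 1);
		transition_end(w->p);
	}
	return NULL;
}

/* Returns the number of overlapping transitions seen, or -1 on error. */
static int run_transitions(int nthreads, int iters)
{
	enum { MAX_THREADS = 16 };
	pthread_t tids[MAX_THREADS];
	struct fake_policy p;
	struct worker_arg w = { &p, iters };
	int started = 0;

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return -1;

	pthread_mutex_init(&p.transition_lock, NULL);
	atomic_init(&p.transition_ongoing, false);
	atomic_init(&p.in_transition, 0);
	atomic_init(&p.overlaps, 0);
	p.completed = 0;

	while (started < nthreads &&
	       pthread_create(&tids[started], NULL, worker, &w) == 0)
		started++;
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	pthread_mutex_destroy(&p.transition_lock);

	if (started != nthreads || p.completed != (long)nthreads * iters)
		return -1;
	return atomic_load(&p.overlaps);
}
```

Calling run_transitions(4, 2000) is expected to return 0, meaning no two
transitions were ever in flight at the same time and no update to the plain
counter was lost.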

Regards,
Srivatsa S. Bhat
