Re: Questions on the PID controller in the intel_pstate driver

Dirk Brandewie <dirk.brandewie@xxxxxxxxx> · Tue, 20 Aug 2013 10:53:15 -0700

On 08/11/2013 06:04 AM, Alexander E. Patrakov wrote:
Hello.

I have noticed that there is a PID controller in the intel_pstate
driver, and thus have questions related to it. Yes, I understand that,
by default, there is only a "proportional" term. I have only read the
code, but have not tried to modify it to see how it breaks.

The goal of this e-mail is to determine whether we can explain
https://bugzilla.kernel.org/show_bug.cgi?id=60727 and to audit the
driver for other bugs. If I worked at Intel, I would have asked the
same questions as a part of normal code review.

1. Usually, the output of the PID controller is directly used to
control the process (in our case, this, naively, would be the
performance MSR). However, in the current driver, the output of
pid_calc is first quantized (by fp_toint(result) at the very end of
the function) and then used as a number of steps to increase or
decrease the MSR value - i.e. essentially integrated. This integration
essentially makes a non-standard integral-doubleintegral-proportional
type of controller instead of PID (and, due to only the P term being
present by default, the result is an integral-only controller with
the quantization step before the integral).

The fp_toint() call is there to return an integer instead of the fixed-point
number used in pid_calc().  I am unclear how this is an integration.

pid_calc() returns an error value that the MSR needs to be adjusted by, not
an MSR value, since there are many SKUs with different P-state ranges.
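
A minimal userspace sketch of that idea, not the driver code itself: the
proportional term yields a signed step count, and the step count is applied
relative to the current P state and clamped to the range of the SKU. The
8-bit fraction width, the helper bodies and the names p_only_steps() and
next_pstate() are assumptions made for illustration.

```c
#include <stdint.h>

#define FRAC_BITS 8                 /* assumed fixed-point fraction width */
typedef int32_t fp_t;

static fp_t int_tofp(int x) { return (fp_t)x * (1 << FRAC_BITS); }
/* Relies on arithmetic right shift of negative values (as GCC and Clang do). */
static int fp_toint(fp_t x) { return x >> FRAC_BITS; }
static fp_t mul_fp(fp_t x, fp_t y) { return (fp_t)(((int64_t)x * y) >> FRAC_BITS); }

/* Proportional-only controller output: a signed count of P-state steps. */
static int p_only_steps(fp_t setpoint, fp_t busy, fp_t p_gain)
{
    fp_t err = setpoint - busy;     /* negative when busier than the setpoint */
    return fp_toint(mul_fp(p_gain, err));
}

/* Hypothetical helper: apply the steps relative to the current P state. */
static int next_pstate(int cur, int ctl, int min_pstate, int max_pstate)
{
    int target = cur - ctl;         /* a negative ctl raises the P state */

    if (target > max_pstate)
        target = max_pstate;
    if (target < min_pstate)
        target = min_pstate;
    return target;
}
```

Because only a relative step count leaves the controller, the same code can
serve parts with different min/max P states; the clamp is the only place
where the range of a particular SKU enters.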

Why was the decision to quantize and then integrate the PID controller
output made, and not the other way round? Why is the explicit
integration step outside of the controller necessary at all, instead
of letting a normal PI controller output (with different coefficients
- i.e. leaving only the I term would get what we have now, minus
quantization) directly control the MSR value?
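
To make the distinction in this question concrete, here is a small sketch
that compares the two orderings for a constant error. It is an open-loop
illustration only (the error does not react to the P state), and the 8-bit
fraction width and both function names are assumptions, not driver code.

```c
#include <stdint.h>

#define FRAC_BITS 8                 /* assumed fixed-point fraction width */
typedef int32_t fp_t;

static fp_t int_tofp(int x) { return (fp_t)x * (1 << FRAC_BITS); }
/* Relies on arithmetic right shift of negative values (as GCC and Clang do). */
static int fp_toint(fp_t x) { return x >> FRAC_BITS; }
static fp_t mul_fp(fp_t x, fp_t y) { return (fp_t)(((int64_t)x * y) >> FRAC_BITS); }

/* Quantize every controller output first, then add up the integer steps. */
static int quantize_then_sum(fp_t gain, fp_t err, int samples)
{
    int steps = 0;

    for (int i = 0; i < samples; i++)
        steps += fp_toint(mul_fp(gain, err));
    return steps;
}

/* Accumulate in fixed point first, quantize once at the end. */
static int sum_then_quantize(fp_t gain, fp_t err, int samples)
{
    fp_t acc = 0;

    for (int i = 0; i < samples; i++)
        acc += mul_fp(gain, err);
    return fp_toint(acc);
}
```

With a gain of 51/256 (about 0.2) and an error of 5 (setpoint 97, busy 92),
each sample contributes just under one step. Quantizing first discards that
fraction on every sample, while accumulating first lets it add up to whole
steps.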

2. In intel_pstate_get_scaled_busy(), there is this line:

         busy_scaled = mul_fp(core_busy, div_fp(max_pstate, current_pstate));

What is the physical meaning of the result - i.e. why does core_busy
need to be upscaled by (max_pstate / current_pstate)? Why is this
result chosen as something to stabilize (by comparing to the setpoint
in the PID controller)?

The scaling is done to find out how busy the CPU is at the current P state.
I.e., if the core is 60% busy and the P state is 60% of max, then it is 100%
busy in the P state and the P state should be increased.
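
The arithmetic of that example can be reproduced in userspace with a short
sketch. The 8-bit fraction width and the helper bodies are assumptions, and
scaled_busy() is a hypothetical stand-in for the line quoted from
intel_pstate_get_scaled_busy().

```c
#include <stdint.h>

#define FRAC_BITS 8                 /* assumed fixed-point fraction width */
typedef int32_t fp_t;

static fp_t int_tofp(int x) { return (fp_t)x * (1 << FRAC_BITS); }
static int fp_toint(fp_t x) { return x >> FRAC_BITS; }
static fp_t mul_fp(fp_t x, fp_t y) { return (fp_t)(((int64_t)x * y) >> FRAC_BITS); }
static fp_t div_fp(fp_t x, fp_t y) { return (fp_t)(((int64_t)x * (1 << FRAC_BITS)) / y); }

/* Busy percentage rescaled by max_pstate / current_pstate, in fixed point. */
static fp_t scaled_busy(int core_busy_pct, int max_pstate, int current_pstate)
{
    fp_t ratio = div_fp(int_tofp(max_pstate), int_tofp(current_pstate));

    return mul_fp(int_tofp(core_busy_pct), ratio);
}
```

With a made-up max_pstate of 20 and current_pstate of 12 (60% of max), a core
that is 60% busy scales to just under 100, because the ratio 20/12 is
truncated in fixed point. That is above a setpoint of 97, so the controller
would ask for a higher P state.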

I am asking because I don't quite understand the logic. Is the goal to
ensure (by the PID regulator choosing the P-state) that the CPU is as
close to 97% busy as possible, under the default policy?

3. Why is 97 chosen as a setpoint? What was the reasoning behind
setting the setpoint to 109 before the change 2134ed4d6 (asking
because 109 does not make sense as the "CPU busy percentage" setpoint
in an integral-only controller, see question (1) why I am calling your
controller integral-only)?

The 97% value goes along with the change to scaling off of max_pstate
instead of turbo_pstate.  The 97 works for the narrow P state bin size,
approx. 2.5% of max_pstate/frequency.

4. Was any "well-established" procedure like Ziegler-Nichols tuning
method tried to tune the coefficients of the PID controller? Were
there any failed attempts?

This algorithm is for tuning linear systems, and the changes in load are
anything but linear.  Changing the operating point (MSR) is instantaneous
and also not linear.

5. Why is a PID controller (a thing that needs tuning, testing for
stability, requires essentially-floating-point calculations and can
provoke "I don't understand" e-mails such as this one) used at all?
What were the arguments for choosing it over, e.g., the simple
heuristic similar to what is in cpufreq_conservative.c?

The PID was chosen after implementing and evaluating a number of algorithms;
the PID had the best power efficiency.  Is the PID the be-all and end-all?
Probably not, but in my testing it was the best I could come up with that fit
the range of workloads: mobile, desktop and datacenter.

Are the default tuning parameters optimal for all workloads?  Clearly not, but
neither are the tuning parameters for any other governor (that I am aware
of).  They were chosen to give good power efficiency for a range of workloads
and processor SKUs.

Note: I am not subscribed to the cpufreq list. Please CC: me on replies.
