On 1/6/2018 5:53 AM, Yaodong Li wrote:
On 01/05/2018 02:15 AM, Sagar Arun Kamble wrote:
On 1/5/2018 3:22 AM, Yaodong Li wrote:
On 01/03/2018 10:10 PM, Sagar Arun Kamble wrote:
Since ring frequency programming needs to consider both IA and GT
frequency requests, I think it is more appropriate to keep the logic
that programs the ring frequency table in a driver that monitors both
IA/GT busyness and power budgets, like intel_ips. intel_ips relies on
a global load derived from all CPUs.
I understand that a power-aware and busyness-based policy might be
trickier, but having that as a tunable will give better flexibility.
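Roughly, the ring frequency table maps each possible GPU frequency to
minimum ring and IA frequencies that the firmware should maintain. A
conceptual sketch of what a busyness/power-aware policy would recompute
(the struct, helper and multiplier below are illustrative only, not the
actual i915 code):

/*
 * Conceptual sketch only, not the i915 implementation: each entry tells
 * the firmware which minimum ring/IA frequencies to request while the
 * GPU runs at a given frequency.
 */
struct ring_freq_entry {
        unsigned int gpu_freq;  /* GT frequency point */
        unsigned int min_ring;  /* minimum ring (uncore) frequency */
        unsigned int min_ia;    /* minimum IA (CPU) frequency */
};

static void fill_ring_freq_table(struct ring_freq_entry *tbl,
                                 unsigned int min_gpu,
                                 unsigned int max_gpu,
                                 unsigned int ring_mult)
{
        unsigned int f;

        for (f = min_gpu; f <= max_gpu; f++, tbl++) {
                tbl->gpu_freq = f;
                tbl->min_ring = ring_mult * f; /* the tunable part */
                tbl->min_ia = 0;  /* let IA follow its own governor */
        }
}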
Just looking at the current code, the way intel_ips checks GPU
busyness cannot reflect the actual workload of the GT
(e.g. GPU busy is true even if there is only one pending request); in
this case we should not increase the ring freq if we
want to use a "workload monitoring" based solution, so we need a
more accurate way to monitor the current GT workload
(e.g. when the pending request count reaches a certain tunable
threshold?).
Yes. Maybe we can share the PMU data about engine busyness with
intel_ips.
Thank you Sagar! Can you tell me more about how we can get the GPU
busyness from PMU data?
It does not seem possible as of now, since the PMU reports accumulated
busy time per engine without any information about when the engines
became active/idle, unless I am missing something. Also, having the PMU
on in a release environment needs to be checked. Chris, could you please
clarify the PMU usage here.
I am also thinking about the two options below to get the GPU busyness:
1. Sharing the i915 RPS power zone (LOW_POWER, BETWEEN, HIGH_POWER) with
intel_ips (a kind of indicator of GPU load).
2. Sampling the GT C0 counter to derive full GPU busyness (see the
sketch below), though this may not have been tested much so far across
platforms.
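To illustrate how either source could be used: both the PMU busy time
and a GT C0 counter are monotonically increasing "time spent busy"
values, so busyness over a window is just the busy delta divided by the
wall-clock delta. A hypothetical sketch (the sampling source is left as
a placeholder, not a real API):

#include <stdint.h>

/*
 * Hypothetical sketch: derive GT busyness from any monotonically
 * increasing busy-time counter (PMU per-engine busy ns, GT C0
 * residency, ...).  How the samples are taken is left open.
 */
struct gt_busy_sample {
        uint64_t busy_ns; /* accumulated busy time at sample time */
        uint64_t now_ns;  /* wall clock at sample time */
};

/* Busyness in percent (0-100) over the interval between two samples. */
static unsigned int gt_busy_percent(const struct gt_busy_sample *prev,
                                    const struct gt_busy_sample *cur)
{
        uint64_t wall = cur->now_ns - prev->now_ns;
        uint64_t busy = cur->busy_ns - prev->busy_ns;

        if (!wall)
                return 0;

        return (unsigned int)(100 * busy / wall);
}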
I think the solution would be to set the ring freq table to use a ring
freq >= the current IA freq (for all possible GPU frequencies) once we
find the GPU workload is high (the threshold needs tuning), and to
decrease the ring freq (use a 2x multiplier?) once we find the GT
workload is low. The benefit of using intel_ips is that it can tell both
the CPU and GPU busyness. However, we do need an accurate way
to check at least the GPU busyness for this issue.
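A rough sketch of that threshold/hysteresis idea; the watermarks, the
2x fallback multiplier and the two helpers are placeholders that would
need tuning and a real implementation:

/*
 * Illustrative policy only: when GT busyness crosses a high watermark,
 * request ring freq >= current IA freq for every GPU frequency point;
 * when it drops below a low watermark, fall back to a fixed multiplier.
 */
#define GT_BUSY_HIGH_PCT        80 /* needs tuning */
#define GT_BUSY_LOW_PCT         20 /* needs tuning */
#define RING_FREQ_DEFAULT_MULT  2  /* the 2x fallback */

extern void set_ring_freq_floor(unsigned int min_ring_freq); /* placeholder */
extern void set_ring_freq_multiplier(unsigned int mult);     /* placeholder */

static void update_ring_freq_policy(unsigned int gt_busy_pct,
                                    unsigned int cur_ia_freq)
{
        if (gt_busy_pct >= GT_BUSY_HIGH_PCT)
                set_ring_freq_floor(cur_ia_freq);
        else if (gt_busy_pct <= GT_BUSY_LOW_PCT)
                set_ring_freq_multiplier(RING_FREQ_DEFAULT_MULT);
        /* in between: leave the table alone (hysteresis) */
}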
Agree.
Thanks
Sagar
On 1/3/2018 11:51 PM, Yaodong Li wrote:
You are thinking of plugging into intel_pstate to make it
smarter for IA freq transitions?
Yep. This seems like the right step to give some automatic support
instead of a module parameter/hardcoded multiplier.
Does this mean we should use a cpufreq/intel_pstate based approach
instead of the current modparam solution for Gen9?
Some concerns and questions about the intel_pstate approach:
a) Currently we cannot get the accurate P-state/target freq value
from cpufreq in intel_pstate active mode, since
these values are not exported to the cpufreq layer; so unless we
change the intel_pstate code, we can only get
the max CPU freq of a new policy.
b) The intel_pstate policy is attached to each logical CPU, which means
we will receive a policy/freq transition notification
for each logical CPU freq change (a rough notifier sketch follows
after these questions). One question is how we are
going to decide the ring freq: just use the max CPU freq reported?
c) With the intel_pstate approach we may still run into thermal
throttling; in that case, can a certain cooling device
be triggered to lower the CPU freq?
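For reference, a minimal sketch of hooking cpufreq transition
notifications (the mechanism behind (b)); note that, per (a),
intel_pstate in active mode may not report per-transition frequencies
through this path, and the notifications arrive per logical CPU. The
note_ia_freq_change() helper is a placeholder for whatever policy we
pick:

#include <linux/cpufreq.h>
#include <linux/notifier.h>

extern void note_ia_freq_change(unsigned int cpu, unsigned int khz); /* placeholder */

static int ring_freq_cpufreq_cb(struct notifier_block *nb,
                                unsigned long action, void *data)
{
        struct cpufreq_freqs *freqs = data;

        if (action != CPUFREQ_POSTCHANGE)
                return NOTIFY_DONE;

        /* freqs->new is the new frequency (kHz) of freqs->cpu */
        note_ia_freq_change(freqs->cpu, freqs->new);

        return NOTIFY_OK;
}

static struct notifier_block ring_freq_nb = {
        .notifier_call = ring_freq_cpufreq_cb,
};

static int register_ring_freq_notifier(void)
{
        return cpufreq_register_notifier(&ring_freq_nb,
                                         CPUFREQ_TRANSITION_NOTIFIER);
}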
Thanks and Regards,
-Jackie