On 05/05/2022 06:40, Vinay Belgaumkar wrote:
SLPC min/max frequency updates require H2G calls. We are seeing
timeouts when GuC channel is backed up and it is unable to respond
in a timely fashion causing warnings and affecting CI.
Is it the "Unable to force min freq" error? Do you have a link to the
GitLab issue to add to commit message?
This is seen when waitboosting happens during a stress test.
this patch updates the waitboost path to use a non-blocking
H2G call instead, which returns as soon as the message is
successfully transmitted.
AFAIU with this approach, when CT channel is congested, you instead
achieve silent dropping of the waitboost request, right?
It sounds like a potentially important feedback from the field to lose
so easily. How about you added drm_notice to the worker when it fails?
Or simply a "one line patch" to replace i915_probe_error (!?) with
drm_notice and keep the blocking behavior. (I have no idea what is the
typical time to drain the CT buffer, and so to decide whether waiting or
dropping makes more sense for effectiveness of waitboosting.)
Or since the congestion /should not/ happen in production, then the
argument is why complicate with more code, in which case going with one
line patch is an easy way forward?
Regards,
Tvrtko
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@xxxxxxxxx>
---
drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 38 ++++++++++++++++-----
1 file changed, 30 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index 1db833da42df..c852f73cf521 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -98,6 +98,30 @@ static u32 slpc_get_state(struct intel_guc_slpc *slpc)
return data->header.global_state;
}
+static int guc_action_slpc_set_param_nb(struct intel_guc *guc, u8 id, u32 value)
+{
+ u32 request[] = {
+ GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST,
+ SLPC_EVENT(SLPC_EVENT_PARAMETER_SET, 2),
+ id,
+ value,
+ };
+ int ret;
+
+ ret = intel_guc_send_nb(guc, request, ARRAY_SIZE(request), 0);
+
+ return ret > 0 ? -EPROTO : ret;
+}
+
+static int slpc_set_param_nb(struct intel_guc_slpc *slpc, u8 id, u32 value)
+{
+ struct intel_guc *guc = slpc_to_guc(slpc);
+
+ GEM_BUG_ON(id >= SLPC_MAX_PARAM);
+
+ return guc_action_slpc_set_param_nb(guc, id, value);
+}
+
static int guc_action_slpc_set_param(struct intel_guc *guc, u8 id, u32 value)
{
u32 request[] = {
@@ -208,12 +232,10 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
*/
with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
- ret = slpc_set_param(slpc,
- SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
- freq);
- if (ret)
- i915_probe_error(i915, "Unable to force min freq to %u: %d",
- freq, ret);
+ /* Non-blocking request will avoid stalls */
+ ret = slpc_set_param_nb(slpc,
+ SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
+ freq);
}
return ret;
@@ -231,8 +253,8 @@ static void slpc_boost_work(struct work_struct *work)
*/
mutex_lock(&slpc->lock);
if (atomic_read(&slpc->num_waiters)) {
- slpc_force_min_freq(slpc, slpc->boost_freq);
- slpc->num_boosts++;
+ if (!slpc_force_min_freq(slpc, slpc->boost_freq))
+ slpc->num_boosts++;
}
mutex_unlock(&slpc->lock);
}