Re: [Intel-gfx] [PATCH] drm/i915/slpc: Optimize waitboost for SLPC

On 10/19/2022 12:40 AM, Tvrtko Ursulin wrote:

On 18/10/2022 23:15, Vinay Belgaumkar wrote:
Waitboost (when SLPC is enabled) results in an H2G message. This can result in thousands of messages during a stress test and fill up an already full CTB. There is no need to request RP0 if GuC is already requesting it.

Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@xxxxxxxxx>
---
  drivers/gpu/drm/i915/gt/intel_rps.c | 9 ++++++++-
  1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index fc23c562d9b2..a20ae4fceac8 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1005,13 +1005,20 @@ void intel_rps_dec_waiters(struct intel_rps *rps)
  void intel_rps_boost(struct i915_request *rq)
  {
      struct intel_guc_slpc *slpc;
+    struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;

      if (i915_request_signaled(rq) || i915_request_has_waitboost(rq))
          return;

+    /* If GuC is already requesting RP0, skip */
+    if (rps_uses_slpc(rps)) {
+        slpc = rps_to_slpc(rps);
+        if (intel_rps_get_requested_frequency(rps) == slpc->rp0_freq)
One correction here: this should be slpc->boost_freq.
+            return;
+    }
+

Feels a little bit like a layering violation. Wait boost reference counts and request markings will change based on asynchronous state - an mmio read.

Also, a little below we have this:

"""
    /* Serializes with i915_request_retire() */
    if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags)) {
        struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;

        if (rps_uses_slpc(rps)) {
            slpc = rps_to_slpc(rps);

            /* Return if old value is non zero */
            if (!atomic_fetch_inc(&slpc->num_waiters))

***>>>> Wouldn't it skip doing anything here already? <<<<***
It will skip only if a boost is already in progress. This patch tries to avoid even that first request where possible.

                schedule_work(&slpc->boost_work);

            return;
        }

        if (atomic_fetch_inc(&rps->num_waiters))
            return;
"""
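
To make the gating in the quoted hunk concrete, here is a small userspace model using C11 atomics. The model_* names are made up for illustration and are not i915 identifiers; only the fetch-and-increment semantics carry over. It shows that the caller which moves the count from 0 to 1 is the only one that queues work, so the first waiter always queues a boost regardless of what GuC is currently requesting.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SLPC waiter bookkeeping; not the i915 struct. */
struct model_slpc {
	atomic_int num_waiters;
	int boosts_scheduled;	/* counts schedule_work() equivalents */
};

static void model_slpc_init(struct model_slpc *slpc)
{
	atomic_init(&slpc->num_waiters, 0);
	slpc->boosts_scheduled = 0;
}

/*
 * Mirrors the "!atomic_fetch_inc()" test: fetch-and-add returns the old
 * value, so only the caller that sees 0 queues the boost work.
 * Returns true if this call queued work.
 */
static bool model_add_waiter(struct model_slpc *slpc)
{
	if (atomic_fetch_add(&slpc->num_waiters, 1) == 0) {
		slpc->boosts_scheduled++;
		return true;
	}
	return false;
}

/* Drops one waiter; a later waiter arriving at count 0 queues work again. */
static void model_remove_waiter(struct model_slpc *slpc)
{
	atomic_fetch_sub(&slpc->num_waiters, 1);
}
```

With two waiters arriving back to back, only the first call returns true; once both are removed, the next waiter queues work again.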

But I wonder if this is not a layering violation already. Looks like one to me at the moment. And as it happens there is an ongoing debug of clvk slowness where I was a bit puzzled by the lack of "boost fence" in trace_printk logs - but now I see how that happens. Does not feel right to me that we lose that tracing with SLPC.
Agreed. Will add the trace to the SLPC case as well. However, the question is what that trace indicates. Even in the host case we log the trace but may skip the actual boost because the requested frequency already matches the boost frequency. IMO, we should log the trace only when we actually decide to boost.

So in general - why wouldn't the correct approach be to solve this in the worker, which perhaps should fork to an SLPC-specific branch and do the consolidations/skips based on mmio reads in there?

Sure, I can move the mmio read to the SLPC worker thread.
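
Roughly, the shape could be something like the userspace model below. This is only a sketch under assumptions: every model_* name is hypothetical, requested_freq stands in for the mmio read of the requested frequency, and the trace counter follows the suggestion above of logging only when a boost is actually requested, which is not settled. The point is that the skip happens inside the worker, so the waiter accounting in intel_rps_boost() no longer depends on asynchronous state.

```c
#include <stdbool.h>

/* All names are hypothetical; this is a userspace model, not driver code. */
struct model_gt {
	unsigned int requested_freq;	/* stand-in for the mmio read */
	unsigned int boost_freq;	/* stand-in for slpc->boost_freq */
	unsigned int h2g_sent;		/* H2G messages the worker issued */
	unsigned int boosts_traced;	/* "boost fence" trace points emitted */
};

/* Stand-in for the H2G that asks GuC to run at boost_freq. */
static void model_send_h2g_boost(struct model_gt *gt)
{
	gt->h2g_sent++;
	gt->requested_freq = gt->boost_freq;
}

/*
 * Worker body: the consolidation lives here rather than in the caller.
 * Returns true only when a boost was actually requested.
 */
static bool model_boost_work(struct model_gt *gt)
{
	/* GuC is already requesting the boost frequency: nothing to send. */
	if (gt->requested_freq == gt->boost_freq)
		return false;

	gt->boosts_traced++;	/* trace only when we really boost */
	model_send_h2g_boost(gt);
	return true;
}
```

In this model a second run of the worker right after a boost sends no further H2G, which is the CTB saving the patch is after.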

Thanks,

Vinay.


Regards,

Tvrtko

      /* Serializes with i915_request_retire() */
      if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags)) {
-        struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;

          if (rps_uses_slpc(rps)) {
              slpc = rps_to_slpc(rps);


