Re: [PATCH 06/19] drm/i915/gt: Schedule request retirement when submission idles

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 19/11/2019 16:20, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-11-19 15:04:46)

On 18/11/2019 23:02, Chris Wilson wrote:
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 33ce258d484f..f7c8fec436a9 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -142,6 +142,7 @@
   #include "intel_engine_pm.h"
   #include "intel_gt.h"
   #include "intel_gt_pm.h"
+#include "intel_gt_requests.h"
   #include "intel_lrc_reg.h"
   #include "intel_mocs.h"
   #include "intel_reset.h"
@@ -2278,6 +2279,18 @@ static void execlists_submission_tasklet(unsigned long data)
               if (timeout && preempt_timeout(engine))
                       preempt_reset(engine);
       }
+
+     /*
+      * If the GPU is currently idle, retire the outstanding completed
+      * requests. This will allow us to enter soft-rc6 as soon as possible,
+      * albeit at the cost of running the retire worker much more frequently
+      * (over the entire GT not just this engine) and emitting more idle
+      * barriers (i.e. kernel context switches unpinning all that went
+      * before) which may add some extra latency.
+      */
+     if (intel_engine_pm_is_awake(engine) &&
+         !execlists_active(&engine->execlists))
+             intel_gt_schedule_retire_requests(engine->gt);

I am still not a fan of doing this for all platforms.

I understand. I think it makes a fair amount of sense to do early
retires, and wish to pursue that if I can show there is no harm.

It's also a bit of a layering problem.

Its not just the cost of retirement but there is
intel_engine_flush_submission on all engines in there as well which we
cannot avoid triggering from this path.

Would it be worth experimenting with additional per-engine retire
workers? Most of the code could be shared, just a little bit of
specialization to filter on engine.

I haven't sketched out anything more than peeking at the last request on
the timeline and doing a rq->engine == engine filter. Walking the global
timeline.active_list in that case is also a nuisance.

That together with:

	flush_submission(gt, engine ? engine->mask : ALL_ENGINES);

Might be enough? At least to satisfy my concern.

Apart layering is still bad.. And I'd still limit it to when RC6 WA is active unless it can be shown there is no perf/power impact across GPU/CPU to do this everywhere.

At which point it becomes easier to just limit it because we have to have it there.

I also wonder if the current flush_submission wasn't the reason for performance regression you were seeing with this? It makes this tasklet wait for all other engines, if they are busy. But not sure.. perhaps it is work which would be done anyway.

There's definitely scope here for us using some more information from
process_csb() about which context completed and limit work to that
timeline. Hmm, something along those lines maybe...

But you want to retire all timelines which have work on this particular physical engine. Otherwise it doesn't get parked, no?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux