On 03/09/2017 12:54 PM, Chris Wilson wrote:
On Thu, Mar 09, 2017 at 05:02:16PM +0000, Tvrtko Ursulin wrote:
On 09/03/2017 08:55, Oscar Mateo wrote:
On 03/09/2017 08:50 AM, Tvrtko Ursulin wrote:
On 09/03/2017 08:42, Oscar Mateo wrote:
On 03/09/2017 02:05 AM, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
In order to ensure no missed interrupts we must first re-direct
the interrupts to GuC, and only then re-submit the requests to
be replayed after a GPU reset. Otherwise context switch can fire
before GuC has been set up to receive it triggering more hangs.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Michal Wajdeczko <michal.wajdeczko@xxxxxxxxx>
Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
Cc: Oscar Mateo <oscar.mateo@xxxxxxxxx>
Cc: Sagar Arun Kamble <sagar.a.kamble@xxxxxxxxx>
Cc: Arkadiusz Hiler <arkadiusz.hiler@xxxxxxxxx>
---
drivers/gpu/drm/i915/i915_guc_submission.c | 58
+++++++++++++++++++++++++++---
drivers/gpu/drm/i915/intel_guc_loader.c | 47
------------------------
2 files changed, 54 insertions(+), 51 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index beb38e30d0e9..5b8ec0fab563 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -936,6 +936,52 @@ static void guc_reset_wq(struct i915_guc_client
*client)
client->wq_tail = 0;
}
<SNIP>
int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
{
struct intel_guc *guc = &dev_priv->guc;
@@ -953,13 +999,17 @@ int i915_guc_submission_enable(struct
drm_i915_private *dev_priv)
/* Take over from manual control of ELSP (execlists) */
for_each_engine(engine, dev_priv, id) {
- const int wqi_size = sizeof(struct guc_wq_item);
- struct drm_i915_gem_request *rq;
-
engine->submit_request = i915_guc_submit;
engine->schedule = NULL;
+ }
+
+ guc_interrupts_capture(dev_priv);
+
+ /* Replay the current set of previously submitted requests */
+ for_each_engine(engine, dev_priv, id) {
+ const int wqi_size = sizeof(struct guc_wq_item);
+ struct drm_i915_gem_request *rq;
Don't you want to move the guc_interrupts_release to
i915_guc_submission_disable as well?
I can't spot anything broken in that path. We never go in that
direction with the live submission so why do you think it is needed?
Regards,
Tvrtko
Just code symmetry: if we are leaving i915_guc_submission_enable to
capture the interrupts, why doesn't the disable also releases them?
Maybe it's not very important now, but it makes a lot more sense with my
series to do proper unwinding of the whole path (I can tackle it there
if you prefer).
I think so. There is a multitude of people trying to refactor the
GuC code at the moment so I would prefer just to fix the reset fail
quickly and not interfere with that wider refactoring too much.
Because I think it is not just a quick job of moving the interrupt
release to get to full symmetry. Ack to merge then?
Exactly, I don't disagree with the desire to make/keep the code
symmetrical, but I also think push the fix and wait for the dust to
settle to fix the otherside, or volunteer somebody...
Just so long as we remember to do it in the short term and not leave
nits to build up.
-Chris
Ok, push the button then. I'll make it symmetrical later.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx