Re: [PATCH] drm/i915: Split I915_RESET_IN_PROGRESS into two flags

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 23/02/17 08:59, Chris Wilson wrote:
I915_RESET_IN_PROGRESS is being used for both signaling the requirement
to i915_mutex_lock_interruptible() to avoid taking the struct_mutex and
to instruct a waiter (already holding the struct_mutex) to perform the
reset. To allow for a little more coordination, split these two meaning
into a couple of distinct flags. I915_RESET_BACKOFF tells
i915_mutex_lock_interruptible() not to acquire the mutex and
I915_RESET_HANDOFF tells the waiter to call i915_reset().

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
---
This is part of a much bigger problem to try and restore the balance
between atomic modeset and resets. However, now that the waiter has
been revamped, this patch should help us ease forward with TDR.
-Chris
---

...

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index eed9ead1b592..7e9b1a008134 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h


reset_count will have a stale comment now, i.e.:

@@ -1558,7 +1558,7 @@ struct i915_gpu_error {
         *
* This is a counter which gets incremented when reset is triggered,
         *
-        * Before the reset commences, the I915_RESET_IN_PROGRESS bit is set
+        * Before the reset commences, the I915_RESET_BACKOFF bit is set
         * meaning that any waiters holding onto the struct_mutex should
         * relinquish the lock immediately in order for the reset to start.
         *


@@ -1578,8 +1578,33 @@ struct i915_gpu_error {
         */
        unsigned long reset_count;

+       /**
+        * flags: Control various stages of the GPU reset
+        *
+        * #I915_RESET_BACKOFF - When we start a reset, we want to stop any
+        * other users acquiring the struct_mutex. To do this we set the
+        * #I915_RESET_BACKOFF bit in the error flags when we detect a reset
+        * and then check for that bit before acquiring the struct_mutex (in
+        * i915_mutex_lock_interruptible()?). I915_RESET_BACKOFF serves a
+        * secondary role in preventing two concurrent global reset attempts.
+        *
+        * #I915_RESET_HANDOFF - To perform the actual GPU reset, we need the
+        * struct_mutex. We try to acquire the struct_mutex in the reset worker,
+        * but it may be held by some long running waiter (that we cannot
+        * interrupt without causing trouble). Once we are ready to do the GPU
+        * reset, we set the I915_RESET_HANDOFF bit and wakeup any waiters. If
+        * they already hold the struct_mutex and want to participate they can
+        * inspect the bit and do the reset directly, otherwise the worker
+        * waits for the struct_mutex.
+        *
+        * #I915_WEDGED - If reset fails and we can no longer use the GPU,
+        * we set the #I915_WEDGED bit. Prior to command submission, e.g.
+        * i915_gem_request_alloc(), this bit is checked and the sequence
+        * aborted (with -EIO reported to userspace) if set.
+        */
        unsigned long flags;
-#define I915_RESET_IN_PROGRESS 0
+#define I915_RESET_BACKOFF     0
+#define I915_RESET_HANDOFF     1
 #define I915_WEDGED            (BITS_PER_LONG - 1)

        /**

I've been looking fwd this change,


Acked-by: Michel Thierry <michel.thierry@xxxxxxxxx>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux