Re: [PATCH] drm/i915/gt: Clear wedged status upon suspend

Hi Rodrigo,

On 1/24/2023 8:26 PM, Rodrigo Vivi wrote:
On Tue, Jan 24, 2023 at 12:07:19PM +0100, Das, Nirmoy wrote:
Forgot to add the drm issue as a reference.

On 1/24/2023 12:05 PM, Nirmoy Das wrote:
From: Chris Wilson <chris.p.wilson@xxxxxxxxxxxxxxx>

Currently we use set-wedged on suspend if the workload is not responding
in order to allow a fast suspend (albeit at the cost of discarding the
current userspace). This may leave the device wedged during suspend,
where we may require the device available in order to swapout CPU
inaccessible device memory. Clear any temporary wedged-status after
flushing userspace off the device so we can use the blitter ourselves
inside suspend.
This seems like a very good move. But this explains the unset_wedged part,
not the removal of retire_requests. Why don't we need to retire them
anymore?


Thanks for noticing that. This is on me: I missed another patch, which
moved intel_gt_retire_requests() inside intel_gt_set_wedged().


Also, what are the chances of races here? I mean, we are marking
the gpu as not wedged anymore. Do we have any guarantee at this point
that no further requests will arrive?


The assumption was that this runs in a single-threaded suspend "context",
so we should be fine, but we just realized that this is called at PM
prepare time. Thanks for raising this. It seems I need to refactor
i915_gem_backup_suspend() as well, which should be called much later on.


Regards,

Nirmoy


Shouldn't we have a way to differentiate between totally wedged
and blocked for user submission?

Testcase: igt/gem_eio/in-flight-suspend
References: https://gitlab.freedesktop.org/drm/intel/-/issues/7896
Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Signed-off-by: Chris Wilson <chris.p.wilson@xxxxxxxxxxxxxxx>
Signed-off-by: Nirmoy Das <nirmoy.das@xxxxxxxxx>
---
   drivers/gpu/drm/i915/gt/intel_gt_pm.c | 10 ++++------
   1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index cef3d6f5c34e..74d1dd3793f9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -317,19 +317,17 @@ int intel_gt_resume(struct intel_gt *gt)
   static void wait_for_suspend(struct intel_gt *gt)
   {
-	if (!intel_gt_pm_is_awake(gt))
-		return;
-
-	if (intel_gt_wait_for_idle(gt, I915_GT_SUSPEND_IDLE_TIMEOUT) == -ETIME) {
+	if (intel_gt_wait_for_idle(gt, I915_GT_SUSPEND_IDLE_TIMEOUT) == -ETIME)
   		/*
   		 * Forcibly cancel outstanding work and leave
   		 * the gpu quiet.
   		 */
   		intel_gt_set_wedged(gt);
-		intel_gt_retire_requests(gt);
-	}
   	intel_gt_pm_wait_for_idle(gt);
+
+	/* Make the GPU available again for swapout */
+	intel_gt_unset_wedged(gt);
   }
   void intel_gt_suspend_prepare(struct intel_gt *gt)
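
For readers following the thread, the order of operations in the patched
wait_for_suspend() can be sketched as a small user-space model. This is
not i915 code: struct toy_gt and all toy_* helpers are hypothetical
stand-ins for the driver functions named in the diff, and the model
assumes (as stated in the reply above) that request retirement now
happens inside the set-wedged step.

```c
/*
 * Toy user-space model of the patched wait_for_suspend() control flow.
 * Nothing here is i915 code; struct toy_gt and the toy_* helpers are
 * hypothetical stand-ins for the driver functions named in the diff.
 */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_gt {
	bool wedged;    /* models the wedged status */
	bool hung;      /* workload never goes idle on its own */
	int in_flight;  /* outstanding requests */
};

/* Stand-in for intel_gt_wait_for_idle(): times out if the workload hangs. */
static int toy_wait_for_idle(struct toy_gt *gt)
{
	if (gt->hung && gt->in_flight > 0)
		return -ETIME;
	gt->in_flight = 0;
	return 0;
}

/*
 * Stand-in for intel_gt_set_wedged(). Assumption taken from the thread:
 * a separate patch moves request retirement inside it, modelled here by
 * dropping the in-flight count.
 */
static void toy_set_wedged(struct toy_gt *gt)
{
	gt->wedged = true;
	gt->in_flight = 0;
}

/* Stand-in for intel_gt_unset_wedged(). */
static void toy_unset_wedged(struct toy_gt *gt)
{
	gt->wedged = false;
}

/* Mirrors the order of operations in the patched function. */
static void toy_wait_for_suspend(struct toy_gt *gt)
{
	assert(gt != NULL);

	if (toy_wait_for_idle(gt) == -ETIME)
		toy_set_wedged(gt);	/* forcibly cancel outstanding work */

	/* intel_gt_pm_wait_for_idle() would wait here for the GT to park. */

	toy_unset_wedged(gt);		/* make the GPU usable for swapout */
}
```

Calling toy_wait_for_suspend() on a struct toy_gt with hung set and a
non-zero in_flight leaves it with no outstanding requests and wedged
cleared. The model is single-threaded, so it says nothing about the race
raised above, where new requests could arrive after the wedged status is
cleared.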


