Re: [PATCH 1/3] drm/v3d: Don't resubmit guilty CSD jobs

I've tested the patch and confirmed that it applies cleanly on top of drm-next.

I've also confirmed that the timeout occurs with the test case described by the developer:

https://github.com/raspberrypi/linux/pull/3816#issuecomment-682251862

Although this is my first review of a patch on the v3d kernel side, I think this patch is fine.

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@xxxxxxxxxx>

On 3/9/20 18:48, Yukimasa Sugizaki wrote:
From: Yukimasa Sugizaki <ysugi@xxxxxxxx>

The previous code misses a check for the timeout error set by
drm_sched_resubmit_jobs(), which results in an infinite GPU reset loop
once a timeout occurs:

[  178.799106] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* Resetting GPU for hang.
[  178.807836] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* V3D_ERR_STAT: 0x00001000
[  179.839132] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* Resetting GPU for hang.
[  179.847865] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* V3D_ERR_STAT: 0x00001000
[  180.879146] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* Resetting GPU for hang.
[  180.887925] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* V3D_ERR_STAT: 0x00001000
[  181.919188] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* Resetting GPU for hang.
[  181.928002] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* V3D_ERR_STAT: 0x00001000
...

This commit adds the same timeout check as in v3d_{bin,render}_job_run(),
so that the job is skipped instead of being resubmitted:

[   66.408962] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* Resetting GPU for hang.
[   66.417734] v3d fec00000.v3d: [drm:v3d_reset [v3d]] *ERROR* V3D_ERR_STAT: 0x00001000
[   66.428296] [drm] Skipping CSD job resubmission due to previous error (-125)

Here -125 is -ECANCELED. Note that users currently have no way to check
whether the timeout has occurred other than inspecting dmesg.

Signed-off-by: Yukimasa Sugizaki <ysugi@xxxxxxxx>
---
  drivers/gpu/drm/v3d/v3d_sched.c | 11 +++++++++++
  1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 0747614a78f0..001216f22017 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -226,6 +226,17 @@ v3d_csd_job_run(struct drm_sched_job *sched_job)
  	struct dma_fence *fence;
  	int i;

+	/* This error is set to -ECANCELED by drm_sched_resubmit_jobs() if this
+	 * job timed out more than sched_job->sched->hang_limit times.
+	 */
+	int error = sched_job->s_fence->finished.error;
+
+	if (unlikely(error < 0)) {
+		DRM_WARN("Skipping CSD job resubmission due to previous error (%d)\n",
+			 error);
+		return ERR_PTR(error);
+	}
+
  	v3d->csd_job = job;

  	v3d_invalidate_caches(v3d);
--
2.7.4
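
For readers who want to see the control flow in isolation, the guard above can be modelled in user space. This is only a sketch under stated assumptions: struct mock_job, mock_resubmit(), mock_csd_job_run() and the err_ptr()/is_err()/ptr_err() helpers are hypothetical stand-ins written for illustration, not the kernel's struct drm_sched_job, drm_sched_resubmit_jobs() or ERR_PTR() family, and the "more than hang_limit timeouts" rule is taken from the comment in the patch rather than from the scheduler source.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins for the kernel's error-pointer helpers. */
static void *err_ptr(long error) { return (void *)(intptr_t)error; }
static int is_err(const void *p) { return (uintptr_t)p >= (uintptr_t)-4095; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Hypothetical, simplified job: only the fields the guard depends on. */
struct mock_job {
	int timeouts;       /* how many times this job has timed out */
	int finished_error; /* models sched_job->s_fence->finished.error */
};

/* Placeholder object returned in place of a real fence on success. */
static int dummy_fence;

/* Models the scheduler flagging a job that timed out more than hang_limit times. */
static void mock_resubmit(struct mock_job *job, int hang_limit)
{
	assert(job != NULL);
	if (job->timeouts > hang_limit)
		job->finished_error = -ECANCELED;
}

/* Models the guard the patch adds at the top of v3d_csd_job_run(). */
static void *mock_csd_job_run(struct mock_job *job)
{
	int error = job->finished_error;

	if (error < 0) {
		fprintf(stderr,
			"Skipping CSD job resubmission due to previous error (%d)\n",
			error);
		return err_ptr(error);
	}

	/* A real driver would program the hardware here and return a fence. */
	return &dummy_fence;
}
```

In this model a job that has not exceeded the limit still gets a valid (non-error) pointer back, while a flagged job is rejected before any hardware would be touched, which is what breaks the reset loop. On Linux the value carried by the error pointer is -125, matching the log line quoted in the commit message.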

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel
