Since GPU reset is not enabled yet anyway, a timeout will just print a message. Can we just change amdgpu_lockup_timeout to MAX_SCHEDULE_TIMEOUT until we enable GPU reset?

Alex

________________________________
From: Evan Quan <evan.quan@xxxxxxx>
Sent: Friday, March 16, 2018 12:52:32 AM
To: amd-gfx at lists.freedesktop.org
Cc: Deucher, Alexander; Quan, Evan
Subject: [PATCH] drm/amdgpu: no job timeout setting on compute queues

Under some heavy compute test (dgemm) environments, it may take the
ASIC more than 50 seconds to finish a single dispatched job, which
triggers the timeout. It's quite annoying, although it does not seem
to cause any real problems.
As a quick workaround, we choose not to enforce the timeout setting
on compute queues.

Change-Id: I210011a90898617367e897a90e9f8fb2639281a3
Signed-off-by: Evan Quan <evan.quan at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 008e198..455a81e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -435,7 +435,9 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
 	if (ring->funcs->type != AMDGPU_RING_TYPE_KIQ) {
 		r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
 				   num_hw_submission, amdgpu_job_hang_limit,
-				   msecs_to_jiffies(amdgpu_lockup_timeout), ring->name);
+				   (ring->funcs->type == AMDGPU_RING_TYPE_COMPUTE) ?
+				   MAX_SCHEDULE_TIMEOUT : msecs_to_jiffies(amdgpu_lockup_timeout),
+				   ring->name);
 		if (r) {
 			DRM_ERROR("Failed to create scheduler on ring %s.\n",
 				  ring->name);
--
2.7.4
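
For anyone who wants to see the effect of the per-ring selection outside the kernel, here is a minimal userspace sketch of the logic in the hunk above. Everything prefixed with sketch_ or SKETCH_ is a hypothetical stand-in, not driver code: the tick rate of 250 is an assumed value, the ring-type enum is reduced to three entries, and the millisecond conversion is a simplified round-up rather than the kernel's msecs_to_jiffies(). The only kernel fact relied on is that MAX_SCHEDULE_TIMEOUT is defined as LONG_MAX.

```c
#include <limits.h>

/* Hypothetical stand-ins for kernel definitions (assumptions, not driver code). */
#define SKETCH_HZ 250L                        /* assumed tick rate */
#define SKETCH_MAX_SCHEDULE_TIMEOUT LONG_MAX  /* kernel: MAX_SCHEDULE_TIMEOUT is LONG_MAX */

/* Reduced, hypothetical ring-type enum for illustration only. */
enum sketch_ring_type {
	SKETCH_RING_GFX,
	SKETCH_RING_COMPUTE,
	SKETCH_RING_SDMA,
};

/* Simplified ms -> ticks conversion, rounding up to a whole tick. */
static long sketch_msecs_to_jiffies(long ms)
{
	return (ms * SKETCH_HZ + 999L) / 1000L;
}

/*
 * Mirrors the ternary in the patch: compute rings get an effectively
 * infinite timeout, every other ring keeps the configured lockup timeout.
 */
static long sketch_ring_timeout(enum sketch_ring_type type,
				long lockup_timeout_ms)
{
	if (type == SKETCH_RING_COMPUTE)
		return SKETCH_MAX_SCHEDULE_TIMEOUT;

	return sketch_msecs_to_jiffies(lockup_timeout_ms);
}
```

With a lockup timeout of 10000 ms passed in as an example value, a non-compute ring gets a finite tick count while a compute ring gets LONG_MAX. The alternative suggested in the reply would amount to returning SKETCH_MAX_SCHEDULE_TIMEOUT for every ring type until GPU reset is enabled.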