Hi Alex,

INT_MAX is used instead of MAX_SCHEDULE_TIMEOUT (which we discussed in another mail thread) because amdgpu_lockup_timeout is of type int. Using MAX_SCHEDULE_TIMEOUT (type long) would produce compile warnings.

Regards,
Evan

-----Original Message-----
From: Evan Quan [mailto:evan.quan@xxxxxxx]
Sent: Monday, March 19, 2018 2:08 PM
To: amd-gfx at lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Quan, Evan <Evan.Quan at amd.com>
Subject: [PATCH] drm/amdgpu: disable job timeout on GPU reset disabled

Under some heavy compute workloads (e.g. the dgemm test), the ASIC can take more than 10 seconds to finish a single dispatched job, which triggers the timeout. It's quite confusing, although it does not seem to cause any real problems.
As a quick workaround, we choose to disable the timeout when GPU reset is disabled.

Change-Id: I3a95d856ba4993094dc7b6269649e470c5b053d2
Signed-off-by: Evan Quan <evan.quan at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 8bd9c3f..9d6a775 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -861,6 +861,13 @@ static void amdgpu_device_check_arguments(struct amdgpu_device *adev)
 		amdgpu_lockup_timeout = 10000;
 	}
 
+	/*
+	 * Disable timeout when GPU reset is disabled to avoid confusing
+	 * timeout messages in the kernel log.
+	 */
+	if (amdgpu_gpu_recovery == 0 || amdgpu_gpu_recovery == -1)
+		amdgpu_lockup_timeout = INT_MAX;
+
 	adev->firmware.load_type = amdgpu_ucode_get_load_type(adev, amdgpu_fw_load_type);
 }
 
-- 
2.7.4
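
For readers following the int-versus-long point, here is a minimal user-space sketch of the same check. It is not the driver code: the `sketch_` names are hypothetical, and the only assumptions carried over from the patch are that the timeout parameter is an int and that recovery values 0 and -1 are the ones treated as "reset disabled".

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper mirroring the check added by the patch:
 * when gpu_recovery is 0 or -1, return the largest value an int
 * can hold so the timeout effectively never fires; otherwise keep
 * the configured timeout.
 */
static int sketch_lockup_timeout(int gpu_recovery, int lockup_timeout)
{
	if (gpu_recovery == 0 || gpu_recovery == -1)
		return INT_MAX;
	return lockup_timeout;
}

/*
 * Hypothetical helper: reports whether a long value would change
 * if it were stored in an int. This is the situation a compiler
 * warns about when a long constant is assigned to an int variable.
 */
static int sketch_would_truncate(long value)
{
	return value > (long)INT_MAX || value < (long)INT_MIN;
}

/* Small self-check exercising both helpers. */
static void sketch_demo(void)
{
	assert(sketch_lockup_timeout(0, 10000) == INT_MAX);
	assert(sketch_lockup_timeout(-1, 10000) == INT_MAX);
	assert(sketch_lockup_timeout(1, 10000) == 10000);
	assert(!sketch_would_truncate(10000L));
}
```

On platforms where long is wider than int, `sketch_would_truncate(LONG_MAX)` is true, which is why an int-sized sentinel such as INT_MAX is the simpler choice for an int parameter.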