Well, using this in sysfs is a bug to begin with. This would prevent
new applications from starting and crash applications which don't expect
to get -EPERM in return here.
If we need to make operations mutually exclusive with resets, then we need
to take the appropriate locks and *not* work around it by abusing
amdgpu_in_reset().
The purpose of amdgpu_in_reset() is just to check in lower level
functions whether we are inside the higher level reset thread, and *not*
to protect anybody from concurrent access.
I think we should probably nuke the underlying flag completely and use
the thread owner of the lock instead to prevent such abuse.
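
A rough userspace sketch of the owner-based check, again with POSIX threads and purely hypothetical names (struct reset_domain here is not the amdgpu structure, and reset_begin(), reset_end(), in_reset_thread() and probe_from_other_thread() are made up for illustration). The check answers "am I the thread doing the reset?", so it returns false for every other thread and cannot be misused as a concurrency guard.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reset domain; not the real amdgpu structure. */
struct reset_domain {
	pthread_mutex_t lock;  /* held for the whole duration of a reset */
	pthread_mutex_t meta;  /* protects owner and owned */
	pthread_t owner;       /* valid only while owned is true */
	bool owned;
};

static struct reset_domain domain = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.meta = PTHREAD_MUTEX_INITIALIZER,
	.owned = false,
};

/* Take the reset lock and record the calling thread as its owner. */
void reset_begin(struct reset_domain *d)
{
	pthread_mutex_lock(&d->lock);
	pthread_mutex_lock(&d->meta);
	d->owner = pthread_self();
	d->owned = true;
	pthread_mutex_unlock(&d->meta);
}

/* Clear the owner and drop the reset lock; only the owner may call this. */
void reset_end(struct reset_domain *d)
{
	pthread_mutex_lock(&d->meta);
	assert(d->owned && pthread_equal(d->owner, pthread_self()));
	d->owned = false;
	pthread_mutex_unlock(&d->meta);
	pthread_mutex_unlock(&d->lock);
}

/* True only when called from the thread that currently runs the reset. */
bool in_reset_thread(struct reset_domain *d)
{
	bool ret;

	pthread_mutex_lock(&d->meta);
	ret = d->owned && pthread_equal(d->owner, pthread_self());
	pthread_mutex_unlock(&d->meta);
	return ret;
}

static void *probe(void *arg)
{
	return (void *)(intptr_t)in_reset_thread(arg);
}

/* Runs the check in a second thread: 1 = true, 0 = false, -1 = error. */
int probe_from_other_thread(struct reset_domain *d)
{
	pthread_t t;
	void *res = NULL;

	if (pthread_create(&t, NULL, probe, d))
		return -1;
	if (pthread_join(t, &res))
		return -1;
	return (int)(intptr_t)res;
}
```

A thread that is not the reset owner always gets false here, even while a reset is running, so anything that needs exclusion has to take the lock itself.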
Regards,
Christian.
On 12.02.24 at 21:56, Deucher, Alexander wrote:
[AMD Official Use Only - General]
Ping?
-----Original Message-----
From: Deucher, Alexander <Alexander.Deucher@xxxxxxx>
Sent: Monday, January 29, 2024 10:56 AM
To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>
Subject: [PATCH] drm/amdgpu: bail on INFO IOCTL if the GPU is in reset
This avoids queries to read registers or query the SMU for telemetry data while
the GPU is in reset. This mirrors what we already do for sysfs.
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index a2df3025a754..d522e99c6f81 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -607,6 +607,9 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 	int i, found, ret;
 	int ui32_size = sizeof(ui32);
 
+	if (amdgpu_in_reset(adev))
+		return -EPERM;
+
 	if (!info->return_size || !info->return_pointer)
 		return -EINVAL;
 
--
2.42.0