On 3/21/2022 3:08 PM, Tao Zhou wrote:
Print the status when it passes, and also tell the user that a gpu reset
is triggered when we fall back to the legacy way.
v2: make the message more explicit.
Signed-off-by: Tao Zhou <tao.zhou1@xxxxxxx>
---
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
index 56902b5bb7b6..32c451f21db7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
@@ -105,8 +105,6 @@ static void event_interrupt_poison_consumption(struct kfd_dev *dev,
if (old_poison)
return;
- pr_warn("RAS poison consumption handling: client id %d\n", client_id);
-
switch (client_id) {
case SOC15_IH_CLIENTID_SE0SH:
case SOC15_IH_CLIENTID_SE1SH:
@@ -130,10 +128,15 @@ static void event_interrupt_poison_consumption(struct kfd_dev *dev,
/* resetting queue passes, do page retirement without gpu reset
* resetting queue fails, fallback to gpu reset solution
*/
- if (!ret)
+ if (!ret) {
+ pr_warn("RAS poison consumption, unmap queue flow succeeds: client id %d\n",
+ client_id);

As discussed in another patch, I understand that pr_* is the legacy
usage in this file. But it won't be helpful in this case when there are
multiple devices, because the message does not say which device it came
from. I would suggest changing to dev_info(): the messages here and
below seem informational about how this situation is handled rather
than warnings of something bad.

Thanks,
Lijo

amdgpu_amdkfd_ras_poison_consumption_handler(dev->adev, false);
- else
+ } else {
+ pr_warn("RAS poison consumption, fallback to gpu reset flow: client id %d\n",
+ client_id);
amdgpu_amdkfd_ras_poison_consumption_handler(dev->adev, true);
+ }
}
static bool event_interrupt_isr_v9(struct kfd_dev *dev,
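
To illustrate the multi-device point in the comment above, here is a minimal userspace sketch, not kernel code. It assumes that dev_info() prefixes each message with the driver name and device name, while pr_warn() prints the message alone. The names mock_device, mock_pr_msg and mock_dev_msg and the PCI addresses are hypothetical and exist only for this illustration. In the driver itself the change would presumably be a dev_info() call on the device pointer reachable from dev->adev, which is an assumption about the struct layout rather than something this patch shows.

```c
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's struct device. */
struct mock_device {
	const char *driver;	/* driver name, e.g. "amdgpu" */
	const char *name;	/* device name, e.g. a PCI address */
};

/*
 * Mimics the pr_warn() call in the patch: the text carries only the
 * client id, so two GPUs reporting the same event look identical.
 */
static int mock_pr_msg(char *buf, size_t len, int client_id)
{
	return snprintf(buf, len,
			"RAS poison consumption, unmap queue flow succeeds: client id %d\n",
			client_id);
}

/*
 * Mimics the assumed dev_info() behaviour: the same text, prefixed
 * with "<driver> <device>: " so the reporting GPU can be identified.
 */
static int mock_dev_msg(char *buf, size_t len,
			const struct mock_device *dev, int client_id)
{
	return snprintf(buf, len,
			"%s %s: RAS poison consumption, unmap queue flow succeeds: client id %d\n",
			dev->driver, dev->name, client_id);
}
```

Calling mock_pr_msg() for two different GPUs with the same client id yields two identical strings, whereas mock_dev_msg() yields strings that differ in the device prefix, which is the information a multi-GPU log needs.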