Re: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler

It's just infrastructure that you use when you need it.
I don't think I ever tested it during a reset, but we deliberately made it very self-reliant: you simply iterate over the dump FIFO through the PMI3 register interface and dump out the content. It is currently supposed to work for the NV family.
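
To illustrate the "iterate a FIFO and dump out the content" idea, here is a minimal user-space sketch. It does not use the real STB interface: `fake_stb_fifo`, `fake_stb_read_data()` and `fake_stb_dump()` are hypothetical names, and the hardware data register is simulated by an array in which each read pops one dword.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware FIFO; NOT the real STB interface. */
struct fake_stb_fifo {
    const uint32_t *words; /* simulated FIFO contents */
    size_t count;          /* number of dwords held by the FIFO */
    size_t pos;            /* index of the next dword to pop */
};

/*
 * Simulates one read of a FIFO data register: every read pops one dword.
 * Reads past the end of the FIFO return 0.
 */
static uint32_t fake_stb_read_data(struct fake_stb_fifo *fifo)
{
    if (fifo->pos >= fifo->count)
        return 0;
    return fifo->words[fifo->pos++];
}

/*
 * Drains up to max_words dwords from the FIFO into buf.
 * Returns the number of dwords actually copied.
 */
static size_t fake_stb_dump(struct fake_stb_fifo *fifo, uint32_t *buf,
                            size_t max_words)
{
    size_t n = 0;

    assert(fifo != NULL && buf != NULL);

    while (n < max_words && fifo->pos < fifo->count)
        buf[n++] = fake_stb_read_data(fifo);

    return n;
}
```

Because the loop only depends on repeated reads of a single data source, nothing else in the driver has to be functional while it runs, which is the point of keeping the dump self-reliant.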

In case you encounter issues during reset let me know and I will do my best to resolve them.

Andrey

On 2022-01-24 11:38, Sharma, Shashank wrote:
Hey Andrey,
That seems like a good idea. May I know if there is a trigger for the STB dump, or is it just infrastructure that one can use when they feel a need to dump info? Also, how reliable is the STB infra during a reset?

Regards
Shashank
On 1/24/2022 5:32 PM, Andrey Grodzovsky wrote:
You can probably add the STB dump we worked on a while ago to your info dump. A reminder of the feature is here: https://www.spinics.net/lists/amd-gfx/msg70751.html

Andrey

On 2022-01-21 15:34, Sharma, Shashank wrote:
From 899ec6060eb7d8a3d4d56ab439e4e6cdd74190a4 Mon Sep 17 00:00:00 2001
From: Somalapuram Amaranath <Amaranath.Somalapuram@xxxxxxx>
Date: Fri, 21 Jan 2022 14:19:42 +0530
Subject: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler

This patch adds a GPU reset handler for the Navi ASIC family, which
typically dumps some of the registers and sends a trace event.

V2: Accommodated call to work function to send uevent

Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@xxxxxxx>
Signed-off-by: Shashank Sharma <shashank.sharma@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 01efda4398e5..ada35d4c5245 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -528,10 +528,38 @@ nv_asic_reset_method(struct amdgpu_device *adev)
     }
 }

+static void amdgpu_reset_dumps(struct amdgpu_device *adev)
+{
+    int r = 0, i;
+
+    /* original raven doesn't have full asic reset */
+    if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
+        !(adev->apu_flags & AMD_APU_IS_RAVEN2))
+        return;
+    for (i = 0; i < adev->num_ip_blocks; i++) {
+        if (!adev->ip_blocks[i].status.valid)
+            continue;
+        if (!adev->ip_blocks[i].version->funcs->reset_reg_dumps)
+            continue;
+        r = adev->ip_blocks[i].version->funcs->reset_reg_dumps(adev);
+
+        if (r)
+            DRM_ERROR("reset_reg_dumps of IP block <%s> failed %d\n",
+                      adev->ip_blocks[i].version->funcs->name, r);
+    }
+
+    /* Schedule work to send uevent */
+    if (!queue_work(system_unbound_wq, &adev->gpu_reset_work))
+        DRM_ERROR("failed to add GPU reset work\n");
+
+    dump_stack();
+}
+
 static int nv_asic_reset(struct amdgpu_device *adev)
 {
     int ret = 0;

+    amdgpu_reset_dumps(adev);
     switch (nv_asic_reset_method(adev)) {
     case AMD_RESET_METHOD_PCI:
         dev_info(adev->dev, "PCI reset\n");


