It's just infrastructure that you use when you need it.
I don't think I ever tested it during a reset, but we deliberately made
it very self-reliant: you simply iterate over a FIFO of the dump through
the PMI3 register interface and dump out the content. It is currently
supposed to work for the NV family.
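For illustration only, the FIFO iteration described above could look
roughly like the sketch below. It is a user-space model and not the
driver code: fake_fifo, fake_read_data_reg and drain_fifo are
hypothetical names, and the real register offsets and PMI3 access
sequence are not modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware FIFO; not the real STB layout. */
struct fake_fifo {
	const uint32_t *words;	/* backing storage that models FIFO content */
	size_t count;		/* number of words available */
	size_t pos;		/* next word to pop */
};

/*
 * Models one read of a data register: every read pops the next word.
 * Reads past the end return 0 in this model; real hardware may differ.
 */
static uint32_t fake_read_data_reg(struct fake_fifo *f)
{
	if (f->pos >= f->count)
		return 0;
	return f->words[f->pos++];
}

/* Drain at most max_words words into out; returns the number copied. */
static size_t drain_fifo(struct fake_fifo *f, uint32_t *out, size_t max_words)
{
	size_t n = 0;

	while (n < max_words && f->pos < f->count)
		out[n++] = fake_read_data_reg(f);
	return n;
}
```

The loop depends on nothing but the register read, which is the
self-reliance point made above: no other driver state is needed to get
the content out.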
If you encounter issues during reset, let me know and I will do my
best to resolve them.
Andrey
On 2022-01-24 11:38, Sharma, Shashank wrote:
Hey Andrey,
That seems like a good idea. May I know if there is a trigger for the
STB dump, or is it just infrastructure that one can use when they
feel a need to dump info? Also, how reliable is the STB infra during
a reset?
Regards
Shashank
On 1/24/2022 5:32 PM, Andrey Grodzovsky wrote:
You can probably add the STB dump we worked on a while ago to your
info dump. A reminder of the feature is here:
https://www.spinics.net/lists/amd-gfx/msg70751.html
Andrey
On 2022-01-21 15:34, Sharma, Shashank wrote:
From 899ec6060eb7d8a3d4d56ab439e4e6cdd74190a4 Mon Sep 17 00:00:00 2001
From: Somalapuram Amaranath <Amaranath.Somalapuram@xxxxxxx>
Date: Fri, 21 Jan 2022 14:19:42 +0530
Subject: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
This patch adds a GPU reset handler for the Navi ASIC family, which
typically dumps some of the registers and sends a trace event.
V2: Accommodated call to work function to send uevent
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@xxxxxxx>
Signed-off-by: Shashank Sharma <shashank.sharma@xxxxxxx>
---
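Note for readers: the handler walks the IP blocks and calls an optional
per-block hook, logging failures without aborting the walk. A minimal
user-space sketch of that pattern follows. The types ip_funcs and
ip_block and the helpers dump_all_blocks, dump_ok and dump_fail are
simplified, hypothetical stand-ins and not the amdgpu structures; unlike
the patch, the sketch also returns a failure count so that it can be
checked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified, hypothetical stand-ins for the driver's IP block types. */
struct ip_funcs {
	const char *name;
	int (*reset_reg_dumps)(void *ctx);	/* optional hook, may be NULL */
};

struct ip_block {
	bool valid;
	const struct ip_funcs *funcs;
};

/*
 * Call the optional dump hook of every valid block. A failing hook is
 * logged and counted, but it does not stop the walk.
 * Returns the number of hooks that failed.
 */
static int dump_all_blocks(const struct ip_block *blocks, size_t n, void *ctx)
{
	int failures = 0;

	for (size_t i = 0; i < n; i++) {
		int r;

		if (!blocks[i].valid)
			continue;
		if (!blocks[i].funcs->reset_reg_dumps)
			continue;

		r = blocks[i].funcs->reset_reg_dumps(ctx);
		if (r) {
			fprintf(stderr,
				"reset_reg_dumps of IP block <%s> failed %d\n",
				blocks[i].funcs->name, r);
			failures++;
		}
	}
	return failures;
}

/* Example hooks for illustration: each counts its call through ctx. */
static int dump_ok(void *ctx)   { (*(int *)ctx)++; return 0; }
static int dump_fail(void *ctx) { (*(int *)ctx)++; return -5; }
```

Blocks that are invalid or that do not implement the hook are skipped
silently, so only blocks that opt in contribute to the dump.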
drivers/gpu/drm/amd/amdgpu/nv.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 01efda4398e5..ada35d4c5245 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -528,10 +528,38 @@ nv_asic_reset_method(struct amdgpu_device *adev)
 	}
 }
+static void amdgpu_reset_dumps(struct amdgpu_device *adev)
+{
+	int r = 0, i;
+
+	/* original raven doesn't have full asic reset */
+	if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
+	    !(adev->apu_flags & AMD_APU_IS_RAVEN2))
+		return;
+	for (i = 0; i < adev->num_ip_blocks; i++) {
+		if (!adev->ip_blocks[i].status.valid)
+			continue;
+		if (!adev->ip_blocks[i].version->funcs->reset_reg_dumps)
+			continue;
+		r = adev->ip_blocks[i].version->funcs->reset_reg_dumps(adev);
+
+		if (r)
+			DRM_ERROR("reset_reg_dumps of IP block <%s> failed %d\n",
+				  adev->ip_blocks[i].version->funcs->name, r);
+	}
+
+	/* Schedule work to send uevent */
+	if (!queue_work(system_unbound_wq, &adev->gpu_reset_work))
+		DRM_ERROR("failed to add GPU reset work\n");
+
+	dump_stack();
+}
+
 static int nv_asic_reset(struct amdgpu_device *adev)
 {
 	int ret = 0;
+	amdgpu_reset_dumps(adev);
 	switch (nv_asic_reset_method(adev)) {
 	case AMD_RESET_METHOD_PCI:
 		dev_info(adev->dev, "PCI reset\n");