Re: [PATCH 2/2] drm/amdgpu: add AMDGPURESET uevent on AMD GPU reset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 1/17/2022 12:03 PM, Somalapuram Amaranath wrote:
AMDGPURESET uevent added to notify userspace, collect dump_stack and trace

Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/nv.c | 45 +++++++++++++++++++++++++++++++++
  1 file changed, 45 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 2ec1ffb36b1f..b73147ae41fb 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -529,10 +529,55 @@ nv_asic_reset_method(struct amdgpu_device *adev)
  	}
  }
+/**
+ * drm_sysfs_reset_event - generate a DRM uevent
+ * @dev: DRM device
+ *
+ * Send a uevent for the DRM device specified by @dev.  Currently we only
+ * set AMDGPURESET=1 in the uevent environment, but this could be expanded to
+ * deal with other types of events.
+ *
+ * Any new uapi should be using the drm_sysfs_connector_status_event()
+ * for uevents on connector status change.
+ */
+void drm_sysfs_reset_event(struct drm_device *dev)
+{
+	char *event_string = "AMDGPURESET=1";
+	char *envp[2] = { event_string, NULL };
+
+	kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
+}
+
+void amdgpu_reset_dumps(struct amdgpu_device *adev)
+{
+	struct drm_device *ddev = adev_to_drm(adev);
+	int r = 0, i;
+
+	/* original raven doesn't have full asic reset */
+	if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
+		!(adev->apu_flags & AMD_APU_IS_RAVEN2))
+		return;
+	for (i = 0; i < adev->num_ip_blocks; i++) {
+		if (!adev->ip_blocks[i].status.valid)
+			continue;
+		if (!adev->ip_blocks[i].version->funcs->reset_reg_dumps)
+			continue;
+		r = adev->ip_blocks[i].version->funcs->reset_reg_dumps(adev);
+
+		if (r)
+			DRM_ERROR("reset_reg_dumps of IP block <%s> failed %d\n",
+					adev->ip_blocks[i].version->funcs->name, r);
+	}
+
+	drm_sysfs_reset_event(ddev);
+	dump_stack();
+}
+
  static int nv_asic_reset(struct amdgpu_device *adev)
  {
  	int ret = 0;
+ amdgpu_reset_dumps(adev);

Alex recently added a patch to reset GPU on suspend. It doesn't make sense to send an event in such cases, guess the original intention is for gpu recovery related cases.

Thanks,
Lijo

  	switch (nv_asic_reset_method(adev)) {
  	case AMD_RESET_METHOD_PCI:
  		dev_info(adev->dev, "PCI reset\n");




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux