Re: [PATCH 4/4] drm/xe/xe_drm_client: Add per drm client reset stats

On 14/02/2025 20:37, Jonathan Cavitt wrote:
Add a counter to xe_drm_client that tracks the number of times the
engine has been reset since the drm client was created.

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@xxxxxxxxx>
---
  drivers/gpu/drm/xe/xe_drm_client.c | 2 ++
  drivers/gpu/drm/xe/xe_drm_client.h | 2 ++
  drivers/gpu/drm/xe/xe_guc_submit.c | 4 +++-
  3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_drm_client.c b/drivers/gpu/drm/xe/xe_drm_client.c
index f15560d0b6ff..ecd2ce99fd19 100644
--- a/drivers/gpu/drm/xe/xe_drm_client.c
+++ b/drivers/gpu/drm/xe/xe_drm_client.c
@@ -492,6 +492,8 @@ static void show_blames(struct drm_printer *p, struct drm_file *file)
  	client = xef->client;
+	drm_printf(p, "drm-client-reset-count:%u\n",
+		   atomic_read(&client->reset_count));

When the drm- prefix is used, keys have to be agreed upon in drm-usage-stats.rst. I therefore suggest checking with other drivers to see whether anyone else would be interested, for example the people who worked on the common DRM wedged event.

Alternatively, when new stats are not universally useful, drivers can use a driver-specific prefix such as xe-. We had this discussion recently about some panthor internal memory stats.
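To illustrate the driver-prefixed option, here is a minimal userspace sketch, not kernel code. It assumes a hypothetical key name, xe-reset-count, which has not been agreed anywhere, and it uses C11 atomics as a stand-in for the kernel's atomic_t. The struct and helper names are likewise made up for the example.

```c
/*
 * Userspace sketch only: mimics how a per-client counter could be
 * emitted as an fdinfo "key:<whitespace>value" line with a
 * driver-specific prefix. The key "xe-reset-count" is hypothetical.
 */
#include <stdatomic.h>
#include <stdio.h>

struct client_stats {
	atomic_uint reset_count;	/* stand-in for the kernel's atomic_t */
};

/* Record one engine reset against the client. */
static void client_note_reset(struct client_stats *c)
{
	atomic_fetch_add(&c->reset_count, 1);
}

/* Format the stat line into buf; returns whatever snprintf() returns. */
static int client_format_reset_stat(struct client_stats *c, char *buf,
				    size_t len)
{
	return snprintf(buf, len, "xe-reset-count:\t%u\n",
			atomic_load(&c->reset_count));
}
```

With the xe- prefix the key stays private to the driver, so it would not need an entry in drm-usage-stats.rst. Generic tools that only understand drm- keys would simply not pick it up.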

Regards,

Tvrtko

  	drm_printf(p, "\n");
  	drm_printf(p, "- Exec queue ban list -\n");
  	spin_lock(&client->blame_lock);
diff --git a/drivers/gpu/drm/xe/xe_drm_client.h b/drivers/gpu/drm/xe/xe_drm_client.h
index d21fd0b90742..c35de675ccfa 100644
--- a/drivers/gpu/drm/xe/xe_drm_client.h
+++ b/drivers/gpu/drm/xe/xe_drm_client.h
@@ -53,6 +53,8 @@ struct xe_drm_client {
  	 * Protected by @blame_lock;
  	 */
  	struct list_head blame_list;
+	/** @reset_count: number of times this drm client has seen an engine reset */
+	atomic_t reset_count;
  #endif
  };
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index d9da5c89429e..8810abc8f04a 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -1988,7 +1988,9 @@ int xe_guc_exec_queue_reset_handler(struct xe_guc *guc, u32 *msg, u32 len)
  		return -EPROTO;
  	hwe = q->hwe;
-
+#ifdef CONFIG_PROC_FS
+	atomic_inc(&q->xef->client->reset_count);
+#endif
  	xe_gt_info(gt, "Engine reset: engine_class=%s, logical_mask: 0x%x, guc_id=%d",
  		   xe_hw_engine_class_to_str(q->class), q->logical_mask, guc_id);



