Re: [PATCH 06/24] drm/amdkfd: add trace_id return

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[AMD Official Use Only - General]



On 2023-11-22 17:21, Felix Kuehling wrote:

On 2023-11-03 09:11, James Zhu wrote:
Add trace_id return for new pc sampling creation per device,
Use IDR to quickly locate pc_sampling_entry for reference.

Signed-off-by: James Zhu <James.Zhu@xxxxxxx>
---
  drivers/gpu/drm/amd/amdkfd/kfd_device.c      |  2 ++
  drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 20 +++++++++++++++++++-
  drivers/gpu/drm/amd/amdkfd/kfd_priv.h        |  6 ++++++
  3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 0e24e011f66b..bcaeedac8fe0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -536,10 +536,12 @@ static void kfd_smi_init(struct kfd_node *dev)
  static void kfd_pc_sampling_init(struct kfd_node *dev)
  {
      mutex_init(&dev->pcs_data.mutex);
+    idr_init_base(&dev->pcs_data.hosttrap_entry.base.pc_sampling_idr, 1);
  }
    static void kfd_pc_sampling_exit(struct kfd_node *dev)
  {
+    idr_destroy(&dev->pcs_data.hosttrap_entry.base.pc_sampling_idr);
      mutex_destroy(&dev->pcs_data.mutex);
  }
  diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
index f0d910ee730c..4c9fc48e1a6a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
@@ -99,6 +99,7 @@ static int kfd_pc_sample_create(struct kfd_process_device *pdd,
  {
      struct kfd_pc_sample_info *supported_format = NULL;
      struct kfd_pc_sample_info user_info;
+    struct pc_sampling_entry *pcs_entry;
      int ret;
      int i;
  @@ -140,7 +141,19 @@ static int kfd_pc_sample_create(struct kfd_process_device *pdd,
          return ret ? ret : -EEXIST;
      }
  -    /* TODO: add trace_id return */
+    pcs_entry = kvzalloc(sizeof(*pcs_entry), GFP_KERNEL);
+    if (!pcs_entry) {
+        mutex_unlock(&pdd->dev->pcs_data.mutex);
+        return -ENOMEM;
+    }
+
+    i = idr_alloc_cyclic(&pdd->dev->pcs_data.hosttrap_entry.base.pc_sampling_idr,
+                pcs_entry, 1, 0, GFP_KERNEL);
+    if (i < 0) {
+        mutex_unlock(&pdd->dev->pcs_data.mutex);
+        kvfree(pcs_entry);
+        return i;
+    }
        if (!pdd->dev->pcs_data.hosttrap_entry.base.use_count)
          memcpy(&pdd->dev->pcs_data.hosttrap_entry.base.pc_sample_info,
@@ -149,6 +162,11 @@ static int kfd_pc_sample_create(struct kfd_process_device *pdd,
      pdd->dev->pcs_data.hosttrap_entry.base.use_count++;
      mutex_unlock(&pdd->dev->pcs_data.mutex);
  +    pcs_entry->pdd = pdd;

One more thought: You have a bunch of pcs_entries pointing to pdd. What mechanism guarantees, that the pcs_entries are destroyed before the pdd on process termination? I think kfd_pc_sampling_exit doesn't run during process termination, because it deals with per-device data structures, not per-process data structures.

Should the IDR be in the PDD rather than the device? In that case you wouldn't even need the pdd pointer in struct pcs_entry.
[JZ] the IDR here is mainly for generating trace_id with this device. I am not sure if ROCr/ROCprofiler are fine with this change which means same process has same trace_id value for different nodes. @Yat Sin, David would you mind give your comments here?

I see you have a patch much later in the series "drm/amdkfd: add pc sampling release when process release". I'd prefer this squashed here and in the patches that add the start/stop functions.

Regards,
  Felix


+    user_args->trace_id = (uint32_t)i;
+
+    pr_debug("alloc pcs_entry = %p, trace_id = 0x%x on gpu 0x%x", pcs_entry, i, pdd->dev->id);
+
      return 0;
  }
  diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 81c925fb2952..642558026d16 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -258,6 +258,7 @@ struct kfd_dev;
    struct kfd_dev_pc_sampling_data {
      uint32_t use_count;         /* Num of PC sampling sessions */
+    struct idr pc_sampling_idr;
      struct kfd_pc_sample_info pc_sample_info;
  };
  @@ -743,6 +744,11 @@ enum kfd_pdd_bound {
   */
  #define SDMA_ACTIVITY_DIVISOR  100
  +struct pc_sampling_entry {
+    bool enabled;
+    struct kfd_process_device *pdd;
+};
+
  /* Data that is per-process-per device. */
  struct kfd_process_device {
      /* The device that owns this data. */

[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux