On 2023-11-10 14:07, Yat Sin, David
wrote:
[JZ] pdd->dev->pcs_data.mutex is per device, not per process. In the future, also it will share protection within different pc sampling methods on the same devices. So I don't think a bigger lock here is good idea.[AMD Official Use Only - General]-----Original Message----- From: Zhu, James <James.Zhu@xxxxxxx> Sent: Friday, November 3, 2023 9:12 AM To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx Cc: Kuehling, Felix <Felix.Kuehling@xxxxxxx>; Greathouse, Joseph <Joseph.Greathouse@xxxxxxx>; Yat Sin, David <David.YatSin@xxxxxxx>; Zhu, James <James.Zhu@xxxxxxx> Subject: [PATCH 19/24] drm/amdkfd: enable pc sampling stop Enable pc sampling stop. Signed-off-by: James Zhu <James.Zhu@xxxxxxx> --- drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 28 +++++++++++++++++-- - drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 ++ 2 files changed, 27 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c index 33d003ca0093..2c4ac5b4cc4b 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c @@ -108,10 +108,32 @@ static int kfd_pc_sample_start(struct kfd_process_device *pdd, return 0; } -static int kfd_pc_sample_stop(struct kfd_process_device *pdd) +static int kfd_pc_sample_stop(struct kfd_process_device *pdd, + struct pc_sampling_entry *pcs_entry) { - return -EINVAL; + bool pc_sampling_stop = false; + + pcs_entry->enabled = false; + mutex_lock(&pdd->dev->pcs_data.mutex);For the START/STOP/DESTROY ioctls, I think we can hold pdd->dev->pcs_data.mutex through the whole IOCTL. Then we would not have to deal with the intermediate states where the START/STOP/DESTROY are happening at the same time.
+ pdd->dev->pcs_data.hosttrap_entry.base.active_count--; + if (!pdd->dev->pcs_data.hosttrap_entry.base.active_count) { + WRITE_ONCE(pdd->dev-pcs_data.hosttrap_entry.base.stop_enable, true);+ pc_sampling_stop = true; + } + mutex_unlock(&pdd->dev->pcs_data.mutex); + if (pc_sampling_stop) { + kfd_process_set_trap_pc_sampling_flag(&pdd->qpd, + pdd->dev-pcs_data.hosttrap_entry.base.pc_sample_info.method,+false); + + mutex_lock(&pdd->dev->pcs_data.mutex); + pdd->dev->pcs_data.hosttrap_entry.base.target_simd = 0; + pdd->dev->pcs_data.hosttrap_entry.base.target_wave_slot = 0; + WRITE_ONCE(pdd->dev-pcs_data.hosttrap_entry.base.stop_enable, false);+ mutex_unlock(&pdd->dev->pcs_data.mutex); + } + + return 0; } static int kfd_pc_sample_create(struct kfd_process_device *pdd, @@ -251,7 +273,7 @@ int kfd_pc_sample(struct kfd_process_device *pdd, if (!pcs_entry->enabled) return -EALREADY; else - return kfd_pc_sample_stop(pdd); + return kfd_pc_sample_stop(pdd, pcs_entry); } return -EINVAL; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 613910e0d440..badaa4d68cc4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -259,6 +259,8 @@ struct kfd_dev; struct kfd_dev_pc_sampling_data { uint32_t use_count; /* Num of PC sampling sessions */ uint32_t active_count; /* Num of active sessions */ + uint32_t target_simd; /* target simd for trap */ + uint32_t target_wave_slot; /* target wave slot for trap */ bool stop_enable; /* pc sampling stop in process */ struct idr pc_sampling_idr; struct kfd_pc_sample_info pc_sample_info; -- 2.25.1