[Public] > -----Original Message----- > From: Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx> > Sent: Friday, March 14, 2025 5:04 PM > To: Kim, Jonathan <Jonathan.Kim@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx > Subject: RE: [PATCH v2] drm/amdkfd: Update return value of > config_dequeue_wait_counts > > [Public] > > -----Original Message----- > From: Kim, Jonathan <Jonathan.Kim@xxxxxxx> > Sent: Friday, March 14, 2025 4:41 PM > To: Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx>; amd- > gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx> > Subject: RE: [PATCH v2] drm/amdkfd: Update return value of > config_dequeue_wait_counts > > [Public] > > > -----Original Message----- > > From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Harish > > Kasiviswanathan > > Sent: Friday, March 14, 2025 4:22 PM > > To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx > > Cc: Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx> > > Subject: [PATCH v2] drm/amdkfd: Update return value of > > config_dequeue_wait_counts > > > > .config_dequeue_wait_counts returns a nop case. Modify return parameter > > to reflect that since the caller also needs to ignore this condition. > > > > v2: Removed redudant code. > > Tidy up code based on review comments > > > > Fixes: <98a5af8103f> ("drm/amdkfd: Add pm_config_dequeue_wait_counts API") > > > > Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@xxxxxxx> > > --- > > .../gpu/drm/amd/amdkfd/kfd_packet_manager.c | 14 ++++---- > > .../drm/amd/amdkfd/kfd_packet_manager_v9.c | 36 +++++++++++-------- > > 2 files changed, 29 insertions(+), 21 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c > > b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c > > index 3f574d82b5fc..502b89639a9f 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c > > @@ -436,19 +436,19 @@ int pm_config_dequeue_wait_counts(struct > > packet_manager *pm, > > > > retval = pm->pmf->config_dequeue_wait_counts(pm, buffer, > > cmd, value); > > - if (!retval) > > + if (retval > 0) { > > retval = kq_submit_packet(pm->priv_queue); > > + > > + /* If default value is modified, cache that in dqm->wait_times > > */ > > + if (!retval && cmd == KFD_DEQUEUE_WAIT_INIT) > > + update_dqm_wait_times(pm->dqm); > > + } > > else > > kq_rollback_packet(pm->priv_queue); > > } > > - > > - /* If default value is modified, cache that value in dqm->wait_times */ > > - if (!retval && cmd == KFD_DEQUEUE_WAIT_INIT) > > - update_dqm_wait_times(pm->dqm); > > - > > out: > > mutex_unlock(&pm->lock); > > - return retval; > > + return retval < 0 ? retval : 0; > > } > > > > int pm_send_unmap_queue(struct packet_manager *pm, > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c > > b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c > > index d440df602393..3c6134d61b2b 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c > > @@ -310,6 +310,13 @@ static inline void > > pm_build_dequeue_wait_counts_packet_info(struct packet_manage > > reg_data); > > } > > > > +/* pm_config_dequeue_wait_counts_v9: Builds WRITE_DATA packet with > > + * register/value for configuring dequeue wait counts > > + * > > + * @return: -ve for failure, 0 for nop and +ve for success and buffer is > > + * filled in with packet > > + * > > + **/ > > static int pm_config_dequeue_wait_counts_v9(struct packet_manager *pm, > > uint32_t *buffer, > > enum kfd_config_dequeue_wait_counts_cmd cmd, > > @@ -321,24 +328,25 @@ static int pm_config_dequeue_wait_counts_v9(struct > > packet_manager *pm, > > > > switch (cmd) { > > case KFD_DEQUEUE_WAIT_INIT: { > > - uint32_t sch_wave = 0, que_sleep = 0; > > - /* Reduce CP_IQ_WAIT_TIME2.QUE_SLEEP to 0x1 from default > > 0x40. > > + uint32_t sch_wave = 0, que_sleep = 1; > > + > > + if (KFD_GC_VERSION(pm->dqm->dev) < IP_VERSION(9, 4, 1) || > > + KFD_GC_VERSION(pm->dqm->dev) > IP_VERSION(10, 0, 0)) > > + return 0; > > From my last comment, I suggested to put this at the beginning of the non-v9 > pm_config_dequeue_wait_counts call that calls this function. > Then you don't have to make the return value more complicated than it currently is. > > [HK]: Ah ok. I didn't really want to put asic specific code in there but in this case > code it is fine as you mentioned we have already overloading these functions. Right. Which is why I also suggested that you could create a front loaded flag or mask if you didn't like this idea. e.g. of a mask: declare dqm->wait_times_override_mask in the kfd_device_queue_manager struct. Do some defines in a header somewhere: #define KFD_DEQUEUE_WAIT_SCH_WAVE_OVERRIDE_FLAG 0x1 #define KFD_DEQUEUE_WAIT_QUE_SLEEP_OVERRIDE_FLAG 0x2 Then in initialize_cpsh: if (KFD_GC_VERSION(pm->dqm->dev) >= IP_VERSION(9, 4, 1) && KFD_GC_VERSION(pm->dqm->dev) < IP_VERSION(10, 0, 0)) { dqm->wait_times_override_mask |= KFD_DEQUEUE_WAIT_QUE_SLEEP_OVERRIDE_FLAG if (amdgpu_emu_mode == 0 && pm->dqm->dev->adev->gmc.is_app_apu && (KFD_GC_VERSION(pm->dqm->dev) == IP_VERSION(9, 4, 3))) dqm->wait_times_override_mask |= KFD_DEQUEUE_WAIT_SCH_WAVE_OVERRIDE_FLAG Then at the beginning of pm_config_dequeue_wait_counts: if (cmd == KFD_DEQUEUE_WAIT_INIT && !dqm->wait_times_override_mask) return 0; And pm_config_dequeue_wait_counts_v9 gets simplified to case KFD_DEQUEUE_WAIT_INIT: uint32_t que_sleep = dqm->wait_times_override_mask & KFD_DEQUEUE_WAIT_QUE_SLEEP_OVERRIDE_FLAG ? 1 : 0; uint32_t sch_wave = dqm->wait_times_override_mask & KFD_DEQUEUE_WAIT_SCH_OVERRIDE_FLAG ? 1 : 0; if (!(que_sleep || sch_wave)) return -EINVAL; // for safety <etc .. etc..> Otherwise, splitting the IP check is a quick and dirty fix. Jon > > Also KFD_GC_VERSION(pm->dqm->dev) > IP_VERSION(10, 0, 0) is incorrect and > should be >= because want to also exclude anything with a major version of 10. > [HK]: good catch > > Jon > > > + > > + /* For all other gfx9 ASICs, > > + * Reduce CP_IQ_WAIT_TIME2.QUE_SLEEP to 0x1 from default > > 0x40. > > * On a 1GHz machine this is roughly 1 microsecond, which is > > * about how long it takes to load data out of memory during > > * queue connect > > * QUE_SLEEP: Wait Count for Dequeue Retry. > > + * > > + * Set CWSR grace period to 1x1000 cycle for GFX9.4.3 APU > > */ > > - if (KFD_GC_VERSION(pm->dqm->dev) >= IP_VERSION(9, 4, 1) && > > - KFD_GC_VERSION(pm->dqm->dev) < IP_VERSION(10, 0, 0)) { > > - que_sleep = 1; > > - > > - /* Set CWSR grace period to 1x1000 cycle for GFX9.4.3 APU > > */ > > - if (amdgpu_emu_mode == 0 && pm->dqm->dev->adev- > > >gmc.is_app_apu && > > - (KFD_GC_VERSION(pm->dqm->dev) == IP_VERSION(9, 4, > > 3))) > > - sch_wave = 1; > > - } else { > > - return 0; > > - } > > + if (amdgpu_emu_mode == 0 && pm->dqm->dev->adev- > > >gmc.is_app_apu && > > + (KFD_GC_VERSION(pm->dqm->dev) == IP_VERSION(9, 4, 3))) > > + sch_wave = 1; > > + > > pm_build_dequeue_wait_counts_packet_info(pm, sch_wave, > > que_sleep, > > ®_offset, ®_data); > > > > @@ -377,7 +385,7 @@ static int pm_config_dequeue_wait_counts_v9(struct > > packet_manager *pm, > > > > packet->data = reg_data; > > > > - return 0; > > + return sizeof(struct pm4_mec_write_data_mmio); > > } > > > > static int pm_unmap_queues_v9(struct packet_manager *pm, uint32_t *buffer, > > -- > > 2.34.1 > >