RE: [PATCH v2] drm/amdkfd: Update return value of config_dequeue_wait_counts

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[Public]

-----Original Message-----
From: Kim, Jonathan <Jonathan.Kim@xxxxxxx>
Sent: Friday, March 14, 2025 5:35 PM
To: Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Subject: RE: [PATCH v2] drm/amdkfd: Update return value of config_dequeue_wait_counts

[Public]

> -----Original Message-----
> From: Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx>
> Sent: Friday, March 14, 2025 5:04 PM
> To: Kim, Jonathan <Jonathan.Kim@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Subject: RE: [PATCH v2] drm/amdkfd: Update return value of
> config_dequeue_wait_counts
>
> [Public]
>
> -----Original Message-----
> From: Kim, Jonathan <Jonathan.Kim@xxxxxxx>
> Sent: Friday, March 14, 2025 4:41 PM
> To: Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx>; amd-
> gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx>
> Subject: RE: [PATCH v2] drm/amdkfd: Update return value of
> config_dequeue_wait_counts
>
> [Public]
>
> > -----Original Message-----
> > From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Harish
> > Kasiviswanathan
> > Sent: Friday, March 14, 2025 4:22 PM
> > To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> > Cc: Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx>
> > Subject: [PATCH v2] drm/amdkfd: Update return value of
> > config_dequeue_wait_counts
> >
> > .config_dequeue_wait_counts returns a nop case. Modify return parameter
> > to reflect that since the caller also needs to ignore this condition.
> >
> > v2: Removed redudant code.
> >     Tidy up code based on review comments
> >
> > Fixes: <98a5af8103f> ("drm/amdkfd: Add pm_config_dequeue_wait_counts API")
> >
> > Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@xxxxxxx>
> > ---
> >  .../gpu/drm/amd/amdkfd/kfd_packet_manager.c   | 14 ++++----
> >  .../drm/amd/amdkfd/kfd_packet_manager_v9.c    | 36 +++++++++++--------
> >  2 files changed, 29 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> > b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> > index 3f574d82b5fc..502b89639a9f 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> > @@ -436,19 +436,19 @@ int pm_config_dequeue_wait_counts(struct
> > packet_manager *pm,
> >
> >               retval = pm->pmf->config_dequeue_wait_counts(pm, buffer,
> >                                                            cmd, value);
> > -             if (!retval)
> > +             if (retval > 0) {
> >                       retval = kq_submit_packet(pm->priv_queue);
> > +
> > +                     /* If default value is modified, cache that in dqm->wait_times
> > */
> > +                     if (!retval && cmd == KFD_DEQUEUE_WAIT_INIT)
> > +                             update_dqm_wait_times(pm->dqm);
> > +             }
> >               else
> >                       kq_rollback_packet(pm->priv_queue);
> >       }
> > -
> > -     /* If default value is modified, cache that value in dqm->wait_times */
> > -     if (!retval && cmd == KFD_DEQUEUE_WAIT_INIT)
> > -             update_dqm_wait_times(pm->dqm);
> > -
> >  out:
> >       mutex_unlock(&pm->lock);
> > -     return retval;
> > +     return retval < 0 ? retval : 0;
> >  }
> >
> >  int pm_send_unmap_queue(struct packet_manager *pm,
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
> > b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
> > index d440df602393..3c6134d61b2b 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
> > @@ -310,6 +310,13 @@ static inline void
> > pm_build_dequeue_wait_counts_packet_info(struct packet_manage
> >               reg_data);
> >  }
> >
> > +/* pm_config_dequeue_wait_counts_v9: Builds WRITE_DATA packet with
> > + *    register/value for configuring dequeue wait counts
> > + *
> > + * @return: -ve for failure, 0 for nop and +ve for success and buffer is
> > + *  filled in with packet
> > + *
> > + **/
> >  static int pm_config_dequeue_wait_counts_v9(struct packet_manager *pm,
> >               uint32_t *buffer,
> >               enum kfd_config_dequeue_wait_counts_cmd cmd,
> > @@ -321,24 +328,25 @@ static int pm_config_dequeue_wait_counts_v9(struct
> > packet_manager *pm,
> >
> >       switch (cmd) {
> >       case KFD_DEQUEUE_WAIT_INIT: {
> > -             uint32_t sch_wave = 0, que_sleep = 0;
> > -             /* Reduce CP_IQ_WAIT_TIME2.QUE_SLEEP to 0x1 from default
> > 0x40.
> > +             uint32_t sch_wave = 0, que_sleep = 1;
> > +
> > +             if (KFD_GC_VERSION(pm->dqm->dev) < IP_VERSION(9, 4, 1) ||
> > +                 KFD_GC_VERSION(pm->dqm->dev) > IP_VERSION(10, 0, 0))
> > +                     return 0;
>
> From my last comment, I suggested to put this at the beginning of the non-v9
> pm_config_dequeue_wait_counts call that calls this function.
> Then you don't have to make the return value more complicated than it currently is.
>
> [HK]: Ah ok. I didn't really want to put asic specific code in there but in this case
> code it is fine as you mentioned we have already overloading these functions.

Right.  Which is why I also suggested that you could create a front loaded flag or mask if you didn't like this idea.

e.g. of a mask:
declare dqm->wait_times_override_mask in the kfd_device_queue_manager struct.

Do some defines in a header somewhere:
#define KFD_DEQUEUE_WAIT_SCH_WAVE_OVERRIDE_FLAG 0x1
#define KFD_DEQUEUE_WAIT_QUE_SLEEP_OVERRIDE_FLAG 0x2

Then in initialize_cpsh:
if (KFD_GC_VERSION(pm->dqm->dev) >= IP_VERSION(9, 4, 1) &&
    KFD_GC_VERSION(pm->dqm->dev) < IP_VERSION(10, 0, 0)) {
        dqm->wait_times_override_mask |= KFD_DEQUEUE_WAIT_QUE_SLEEP_OVERRIDE_FLAG
if (amdgpu_emu_mode == 0 && pm->dqm->dev->adev->gmc.is_app_apu &&
    (KFD_GC_VERSION(pm->dqm->dev) == IP_VERSION(9, 4, 3)))
        dqm->wait_times_override_mask |= KFD_DEQUEUE_WAIT_SCH_WAVE_OVERRIDE_FLAG

Then at the beginning of pm_config_dequeue_wait_counts:
if (cmd == KFD_DEQUEUE_WAIT_INIT && !dqm->wait_times_override_mask)
   return 0;

And pm_config_dequeue_wait_counts_v9 gets simplified to
case KFD_DEQUEUE_WAIT_INIT:
   uint32_t que_sleep = dqm->wait_times_override_mask & KFD_DEQUEUE_WAIT_QUE_SLEEP_OVERRIDE_FLAG ? 1 : 0;
   uint32_t sch_wave = dqm->wait_times_override_mask & KFD_DEQUEUE_WAIT_SCH_OVERRIDE_FLAG ? 1 : 0;

   if (!(que_sleep || sch_wave))
        return -EINVAL;  // for safety

   <etc .. etc..>


Otherwise, splitting the IP check is a quick and dirty fix.

[HK]: Going with this one for now. Can revisit again if more modifications are needed.

Jon

>
> Also KFD_GC_VERSION(pm->dqm->dev) > IP_VERSION(10, 0, 0) is incorrect and
> should be >= because want to also exclude anything with a major version of 10.
> [HK]: good catch
>
> Jon
>
> > +
> > +             /* For all other gfx9 ASICs,
> > +              * Reduce CP_IQ_WAIT_TIME2.QUE_SLEEP to 0x1 from default
> > 0x40.
> >                * On a 1GHz machine this is roughly 1 microsecond, which is
> >                * about how long it takes to load data out of memory during
> >                * queue connect
> >                * QUE_SLEEP: Wait Count for Dequeue Retry.
> > +              *
> > +              * Set CWSR grace period to 1x1000 cycle for GFX9.4.3 APU
> >                */
> > -             if (KFD_GC_VERSION(pm->dqm->dev) >= IP_VERSION(9, 4, 1) &&
> > -                 KFD_GC_VERSION(pm->dqm->dev) < IP_VERSION(10, 0, 0)) {
> > -                     que_sleep = 1;
> > -
> > -                     /* Set CWSR grace period to 1x1000 cycle for GFX9.4.3 APU
> > */
> > -                     if (amdgpu_emu_mode == 0 && pm->dqm->dev->adev-
> > >gmc.is_app_apu &&
> > -                     (KFD_GC_VERSION(pm->dqm->dev) == IP_VERSION(9, 4,
> > 3)))
> > -                             sch_wave = 1;
> > -             } else {
> > -                     return 0;
> > -             }
> > +             if (amdgpu_emu_mode == 0 && pm->dqm->dev->adev-
> > >gmc.is_app_apu &&
> > +                 (KFD_GC_VERSION(pm->dqm->dev) == IP_VERSION(9, 4, 3)))
> > +                     sch_wave = 1;
> > +
> >               pm_build_dequeue_wait_counts_packet_info(pm, sch_wave,
> > que_sleep,
> >                       &reg_offset, &reg_data);
> >
> > @@ -377,7 +385,7 @@ static int pm_config_dequeue_wait_counts_v9(struct
> > packet_manager *pm,
> >
> >       packet->data = reg_data;
> >
> > -     return 0;
> > +     return sizeof(struct pm4_mec_write_data_mmio);
> >  }
> >
> >  static int pm_unmap_queues_v9(struct packet_manager *pm, uint32_t *buffer,
> > --
> > 2.34.1
>
>






[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux