Re: [PATCH 2/4] drm/amdkfd: Use asic specifc function to set CP wait time

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2025-02-18 15:27, Kim, Jonathan wrote:
> [Public]
> 
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Harish
>> Kasiviswanathan
>> Sent: Wednesday, February 12, 2025 5:04 PM
>> To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>> Cc: Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx>
>> Subject: [PATCH 2/4] drm/amdkfd: Use asic specifc function to set CP wait time
>>
>> Currently, grace period (SCH_WAVE) is set only for gfx943 APU. This
>> could change as other wait times also needs to be set. Move ASIC
>> specific settings to ASIC specific function.
>>
>> Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@xxxxxxx>
>> ---
>>  .../drm/amd/amdkfd/kfd_device_queue_manager.c | 28 ++++---------------
>>  .../drm/amd/amdkfd/kfd_device_queue_manager.h |  9 ++++++
>>  .../gpu/drm/amd/amdkfd/kfd_packet_manager.c   |  8 ++++++
>>  .../drm/amd/amdkfd/kfd_packet_manager_v9.c    | 10 +++++++
>>  4 files changed, 32 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> index b88a95b5ae0d..3eaa44d0410d 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> @@ -1760,10 +1760,7 @@ static int initialize_cpsch(struct device_queue_manager
>> *dqm)
>>
>>       init_sdma_bitmaps(dqm);
>>
>> -     if (dqm->dev->kfd2kgd->get_iq_wait_times)
>> -             dqm->dev->kfd2kgd->get_iq_wait_times(dqm->dev->adev,
>> -                                     &dqm->wait_times,
>> -                                     ffs(dqm->dev->xcc_mask) - 1);
>> +     update_dqm_wait_times(dqm);
>>       return 0;
>>  }
>>
>> @@ -1859,27 +1856,12 @@ static int start_cpsch(struct device_queue_manager
>> *dqm)
>>       /* clear hang status when driver try to start the hw scheduler */
>>       dqm->sched_running = true;
>>
>> -     if (!dqm->dev->kfd->shared_resources.enable_mes)
>> +     if (!dqm->dev->kfd->shared_resources.enable_mes) {
>> +             if(pm_set_compute_queue_wait_counts(&dqm->packet_mgr,
>> +                                         KFD_INIT_CP_QUEUE_WAIT_TIMES))
>> +                     dev_err(dev, "Failed to configure CP wait times\n");
>>               execute_queues_cpsch(dqm,
>> KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0,
>> KFD_SET_DEFAULT_CP_QUEUE_WAIT_TIMES);
>> -
>> -     /* Set CWSR grace period to 1x1000 cycle for GFX9.4.3 APU */
>> -     if (amdgpu_emu_mode == 0 && dqm->dev->adev->gmc.is_app_apu &&
>> -         (KFD_GC_VERSION(dqm->dev) == IP_VERSION(9, 4, 3))) {
>> -             uint32_t reg_offset = 0;
>> -             uint32_t grace_period = 1;
>> -
>> -             retval = pm_set_compute_queue_wait_counts(&dqm->packet_mgr,
>> -                                             grace_period);
>> -             if (retval)
>> -                     dev_err(dev, "Setting grace timeout failed\n");
>> -             else if (dqm->dev->kfd2kgd->build_grace_period_packet_info)
>> -                     /* Update dqm->wait_times maintained in software */
>> -                     dqm->dev->kfd2kgd->build_grace_period_packet_info(
>> -                                     dqm->dev->adev, dqm->wait_times,
>> -                                     grace_period, &reg_offset,
>> -                                     &dqm->wait_times);
>>       }
>> -
> 
> Btw, is_app_apu is also tied to GFX 9.4.4.  Just double checking, we only need to shorten the dequeue wait for GFX 9.4.3 right?
> Also, I'm not sure if I like the idea of another macro to define a new grace period.  It looks like we could front load this somehow.
> How about this?
> - in kfd_device_queue_manager.h, under  struct device_queue_manger, declare member uint32_t dequeue_wait_override.
> - in kfd_device_queue_manager.c under device_queue_manager_init, do the IP check there and set dequeue_wait_override either to 1 or the default macro based on check value.  This way, the override is set during KFD node init.
> - then in start_cpsh, you can do -> if (dqm->dequeue_wait_override != DEFAULT_MACRO) pm_set_etc_etc...
> That seems clearer that the default "override" is HW IP-device based rather than something packet based.
> 
> Jon

Thanks Jon. I too agree that these macros and even over-loading variable as value & macro is ugly.
I will send a new set of patches as following. Please let me know your suggestions.

- modify pm_update_grace_period() to new name but add additional parameter. Somthing like, pm_config_dequeue_wait(pm, enum dequeue_wait_command cmd, uint value)
  -- This way this API is independent unlike before where USE_DEFAULT_GRACE_PERIOD was used in multiple places.
  -- enum dequeue_wait_command will have three options now: 
   --- DEQUEUE_WAIT_INIT  , value (unused) : Initialize which will be called once. Compute asic optimized value. Set it and it will be stored in dqm->wait_times for future use.
   --- DEQUEUE_SET_SCH_WAVE, value: The value will be used to set sch_wave
   --- DEQUEUE_RESET, value (unused): Sets the value back to default dqm->wait_times

start_cpsch() --> will call with  DEQUEUE_WAIT_INIT
unmap_cpsch() --> will call with either (DEQUEUE_SET_SCH_WAVE, value) or (DEQUEUE_RESET)


For comments on other patches. grace_period will be changed to sch_wave to be consistent. 



> 
>>       /* setup per-queue reset detection buffer  */
>>       num_hw_queue_slots =  dqm->dev->kfd-
>>> shared_resources.num_queue_per_pipe *
>>                             dqm->dev->kfd->shared_resources.num_pipe_per_mec *
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> index 273c04a95568..8a7d9b2a135c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> @@ -38,6 +38,7 @@
>>  #define KFD_MES_PROCESS_QUANTUM              100000
>>  #define KFD_MES_GANG_QUANTUM         10000
>>  #define KFD_SET_DEFAULT_CP_QUEUE_WAIT_TIMES 0xffffffff
>> +#define KFD_INIT_CP_QUEUE_WAIT_TIMES     0xfffffffe
>>
>>  struct device_process_node {
>>       struct qcm_process_device *qpd;
>> @@ -359,4 +360,12 @@ static inline int read_sdma_queue_counter(uint64_t __user
>> *q_rptr, uint64_t *val
>>       /* SDMA activity counter is stored at queue's RPTR + 0x8 location. */
>>       return get_user(*val, q_rptr + 1);
>>  }
>> +
>> +static inline void update_dqm_wait_times(struct device_queue_manager *dqm)
>> +{
>> +     if (dqm->dev->kfd2kgd->get_iq_wait_times)
>> +             dqm->dev->kfd2kgd->get_iq_wait_times(dqm->dev->adev,
>> +                                     &dqm->wait_times,
>> +                                     ffs(dqm->dev->xcc_mask) - 1);
>> +}
>>  #endif /* KFD_DEVICE_QUEUE_MANAGER_H_ */
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> index 8d2f63a38724..e94ed478bf5e 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> @@ -402,6 +402,8 @@ int pm_send_query_status(struct packet_manager *pm,
>> uint64_t fence_address,
>>   *  @wait_counts_config: Parameter overridden. Could be flag or grace_period
>>   *   Possible flag values:
>>   *     KFD_SET_DEFAULT_CP_QUEUE_WAIT_TIMES then reset to default value
>> + *     KFD_INIT_CP_QUEUE_WAIT_TIMES Initialize compute queue wait times with
>> + *      best values for each ASIC.
>>   *
>>   *   If not an above flag, Wait Count for Scheduling Wave Message (SCH_WAVE)
>>   *      is set to wait_counts_config value.
>> @@ -415,6 +417,10 @@ int pm_set_compute_queue_wait_counts(struct
>> packet_manager *pm, uint32_t wait_co
>>       int retval = 0;
>>       uint32_t *buffer, size;
>>
>> +     if (!pm->pmf->set_compute_queue_wait_counts ||
>> +         !pm->pmf->set_compute_queue_wait_counts_size)
>> +             return 0;
>> +
>>       size = pm->pmf->set_compute_queue_wait_counts_size;
>>
>>       mutex_lock(&pm->lock);
>> @@ -440,6 +446,8 @@ int pm_set_compute_queue_wait_counts(struct
>> packet_manager *pm, uint32_t wait_co
>>
>>  out:
>>       mutex_unlock(&pm->lock);
>> +     /* Update dqm->wait_times maintained in software */
>> +     update_dqm_wait_times(pm->dqm);
>>       return retval;
>>  }
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
>> index 8b693a9446e8..6a5ddadec936 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
>> @@ -302,9 +302,19 @@ static int pm_set_compute_queue_wait_counts_v9(struct
>> packet_manager *pm,
>>               uint32_t wait_counts_config)
>>  {
>>       struct pm4_mec_write_data_mmio *packet;
>> +     struct device_queue_manager *dqm = pm->dqm;
>>       uint32_t reg_offset = 0;
>>       uint32_t reg_data = 0;
>>
>> +     if (wait_counts_config == KFD_INIT_CP_QUEUE_WAIT_TIMES) {
>> +                /* Set CWSR grace period to 1x1000 cycle for GFX9.4.3 APU */
>> +                if (amdgpu_emu_mode == 0 && dqm->dev->adev->gmc.is_app_apu &&
>> +                    KFD_GC_VERSION(dqm->dev) == IP_VERSION(9, 4, 3))
>> +                        wait_counts_config = 1;
>> +                else
>> +                        return 0;
>> +        }
>> +
>>       pm->dqm->dev->kfd2kgd->build_grace_period_packet_info(
>>                       pm->dqm->dev->adev,
>>                       pm->dqm->wait_times,
>> --
>> 2.34.1
> 



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux