Re: [PATCH 1/5] drm/amdgpu/gfx: add ring helpers for setting workload profile

Alex Deucher <alexdeucher@xxxxxxxxx> · Mon, 13 Jan 2025 11:54:11 -0500

On Fri, Jan 10, 2025 at 10:40 PM Lazar, Lijo <lijo.lazar@xxxxxxx> wrote:
>
>
>
> On 1/10/2025 8:33 PM, Alex Deucher wrote:
> > On Thu, Jan 9, 2025 at 10:30 PM Lazar, Lijo <lijo.lazar@xxxxxxx> wrote:
> >>
> >>
> >>
> >> On 1/9/2025 10:36 PM, Alex Deucher wrote:
> >>> On Thu, Jan 9, 2025 at 12:59 AM Lazar, Lijo <lijo.lazar@xxxxxxx> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 1/9/2025 4:26 AM, Alex Deucher wrote:
> >>>>> Add helpers to switch the workload profile dynamically when
> >>>>> commands are submitted.  This allows us to switch to
> >>>>> the FULLSCREEN3D or COMPUTE profile when work is submitted.
> >>>>> Add a delayed work handler to delay switching out of the
> >>>>> selected profile if additional work comes in.  This works
> >>>>> the same as the VIDEO profile for VCN.  This lets dynamically
> >>>>> enable workload profiles on the fly and then move back
> >>>>> to the default when there is no work.
> >>>>>
> >>>>> Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
> >>>>> ---
> >>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 57 +++++++++++++++++++++++++
> >>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 11 +++++
> >>>>>  2 files changed, 68 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> >>>>> index 6d5d81f0dc4e7..c542617121393 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> >>>>> @@ -2110,6 +2110,63 @@ void amdgpu_gfx_enforce_isolation_ring_end_use(struct amdgpu_ring *ring)
> >>>>>       mutex_unlock(&adev->enforce_isolation_mutex);
> >>>>>  }
> >>>>>
> >>>>> +void amdgpu_gfx_profile_idle_work_handler(struct work_struct *work)
> >>>>> +{
> >>>>> +     struct amdgpu_device *adev =
> >>>>> +             container_of(work, struct amdgpu_device, gfx.idle_work.work);
> >>>>> +     enum PP_SMC_POWER_PROFILE profile;
> >>>>> +     u32 i, fences = 0;
> >>>>> +     int r;
> >>>>> +
> >>>>> +     if (adev->gfx.num_gfx_rings)
> >>>>> +             profile = PP_SMC_POWER_PROFILE_FULLSCREEN3D;
> >>>>> +     else
> >>>>> +             profile = PP_SMC_POWER_PROFILE_COMPUTE;
> >>>>
> >>>> Since profile selection is in generic code, it makes sense to first
> >>>> check if the profile is supported for the family. Otherwise, this needs
> >>>> to be passed by the respective GFX family.
> >>>
> >>> The generic code already handles this.  If you select an unsupported
> >>> profile, it's ignored when the mask is updated.
> >>>
> >>
> >> That is strange. Does that mean user never gets an error if user
> >> attempts to set an unsupported profile?
> >
> > If you use sysfs, you can only select from the available options
> > supported by the chip so there is no way to select a non-supported
> > profile.  For the internal driver API, we just silently ignore it.
> >
> >>
> >> Another problem is this could override the user set profile now. Is that
> >> intended? In the current logic, whenever user sets a profile, the
> >> current profile is removed. With this one, another profile gets added
> >> and the user preference could be ignored depending on the priority.
> >
> > Yes, I think.  For VCN we already select the video profile in a
> > similar manner and for ROCm we already select the compute profile,
> > this just extends that to gfx.  This doesn't really change the
> > behavior compared to the current state of the driver.  At the moment
> > we default to fullscreen3d on navi chips and on MI chips we always
> > enable compute when ROCm is active.  The change here is that we
> > eventually fall back to the bootup profile by default when the GPU is
> > idle.  This allows PMFW to enable additional power saving features
> > while still providing a boost when applications are running.
> >
>
> Sounds good. Only concern is if user intentionally wants to use power
> saving profile all the time. Not sure if 3D has a lower priority than that.

Yes, power save is higher priority than fullscreen 3d.

Alex

>
> That aside, series is -
>         Reviewed-by: Lijo Lazar <lijo.lazar@xxxxxxx>
>
> Thanks,
> Lijo
>
> > Alex
> >
> >>
> >> Thanks,
> >> Lijo
> >>
> >>> Alex
> >>>
> >>>>
> >>>> Thanks,
> >>>> Lijo
> >>>>
> >>>>> +
> >>>>> +     for (i = 0; i < AMDGPU_MAX_GFX_RINGS; ++i)
> >>>>> +             fences += amdgpu_fence_count_emitted(&adev->gfx.gfx_ring[i]);
> >>>>> +     for (i = 0; i < (AMDGPU_MAX_COMPUTE_RINGS * AMDGPU_MAX_GC_INSTANCES); ++i)
> >>>>> +             fences += amdgpu_fence_count_emitted(&adev->gfx.compute_ring[i]);
> >>>>> +     if (!fences && !atomic_read(&adev->gfx.total_submission_cnt)) {
> >>>>> +             r = amdgpu_dpm_switch_power_profile(adev, profile, false);
> >>>>> +             if (r)
> >>>>> +                     dev_warn(adev->dev, "(%d) failed to disable %s power profile mode\n", r,
> >>>>> +                              profile == PP_SMC_POWER_PROFILE_FULLSCREEN3D ?
> >>>>> +                              "fullscreen 3D" : "compute");
> >>>>> +     } else {
> >>>>> +             schedule_delayed_work(&adev->gfx.idle_work, GFX_PROFILE_IDLE_TIMEOUT);
> >>>>> +     }
> >>>>> +}
> >>>>> +
> >>>>> +void amdgpu_gfx_profile_ring_begin_use(struct amdgpu_ring *ring)
> >>>>> +{
> >>>>> +     struct amdgpu_device *adev = ring->adev;
> >>>>> +     enum PP_SMC_POWER_PROFILE profile;
> >>>>> +     int r;
> >>>>> +
> >>>>> +     if (adev->gfx.num_gfx_rings)
> >>>>> +             profile = PP_SMC_POWER_PROFILE_FULLSCREEN3D;
> >>>>> +     else
> >>>>> +             profile = PP_SMC_POWER_PROFILE_COMPUTE;
> >>>>> +
> >>>>> +     atomic_inc(&adev->gfx.total_submission_cnt);
> >>>>> +
> >>>>> +     if (!cancel_delayed_work_sync(&adev->gfx.idle_work)) {
> >>>>> +             r = amdgpu_dpm_switch_power_profile(adev, profile, true);
> >>>>> +             if (r)
> >>>>> +                     dev_warn(adev->dev, "(%d) failed to disable %s power profile mode\n", r,
> >>>>> +                              profile == PP_SMC_POWER_PROFILE_FULLSCREEN3D ?
> >>>>> +                              "fullscreen 3D" : "compute");
> >>>>> +     }
> >>>>> +}
> >>>>> +
> >>>>> +void amdgpu_gfx_profile_ring_end_use(struct amdgpu_ring *ring)
> >>>>> +{
> >>>>> +     atomic_dec(&ring->adev->gfx.total_submission_cnt);
> >>>>> +
> >>>>> +     schedule_delayed_work(&ring->adev->gfx.idle_work, GFX_PROFILE_IDLE_TIMEOUT);
> >>>>> +}
> >>>>> +
> >>>>>  /*
> >>>>>   * debugfs for to enable/disable gfx job submission to specific core.
> >>>>>   */
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> >>>>> index 7f9e261f47f11..6c84598caec21 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> >>>>> @@ -57,6 +57,9 @@ enum amdgpu_gfx_pipe_priority {
> >>>>>  #define AMDGPU_GFX_QUEUE_PRIORITY_MINIMUM  0
> >>>>>  #define AMDGPU_GFX_QUEUE_PRIORITY_MAXIMUM  15
> >>>>>
> >>>>> +/* 1 second timeout */
> >>>>> +#define GFX_PROFILE_IDLE_TIMEOUT     msecs_to_jiffies(1000)
> >>>>> +
> >>>>>  enum amdgpu_gfx_partition {
> >>>>>       AMDGPU_SPX_PARTITION_MODE = 0,
> >>>>>       AMDGPU_DPX_PARTITION_MODE = 1,
> >>>>> @@ -477,6 +480,9 @@ struct amdgpu_gfx {
> >>>>>       bool                            kfd_sch_inactive[MAX_XCP];
> >>>>>       unsigned long                   enforce_isolation_jiffies[MAX_XCP];
> >>>>>       unsigned long                   enforce_isolation_time[MAX_XCP];
> >>>>> +
> >>>>> +     atomic_t                        total_submission_cnt;
> >>>>> +     struct delayed_work             idle_work;
> >>>>>  };
> >>>>>
> >>>>>  struct amdgpu_gfx_ras_reg_entry {
> >>>>> @@ -585,6 +591,11 @@ void amdgpu_gfx_cleaner_shader_init(struct amdgpu_device *adev,
> >>>>>  void amdgpu_gfx_enforce_isolation_handler(struct work_struct *work);
> >>>>>  void amdgpu_gfx_enforce_isolation_ring_begin_use(struct amdgpu_ring *ring);
> >>>>>  void amdgpu_gfx_enforce_isolation_ring_end_use(struct amdgpu_ring *ring);
> >>>>> +
> >>>>> +void amdgpu_gfx_profile_idle_work_handler(struct work_struct *work);
> >>>>> +void amdgpu_gfx_profile_ring_begin_use(struct amdgpu_ring *ring);
> >>>>> +void amdgpu_gfx_profile_ring_end_use(struct amdgpu_ring *ring);
> >>>>> +
> >>>>>  void amdgpu_debugfs_gfx_sched_mask_init(struct amdgpu_device *adev);
> >>>>>  void amdgpu_debugfs_compute_sched_mask_init(struct amdgpu_device *adev);
> >>>>>
> >>>>
> >>
>