On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank <shashank.sharma@xxxxxxx> wrote:
>
>
>
> On 9/27/2022 10:40 PM, Alex Deucher wrote:
> > On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank
> > <shashank.sharma@xxxxxxx> wrote:
> >>
> >>
> >>
> >> On 9/27/2022 5:23 PM, Felix Kuehling wrote:
> >>> On 2022-09-27 at 10:58, Sharma, Shashank wrote:
> >>>> Hello Felix,
> >>>>
> >>>> Thanks for the review comments.
> >>>>
> >>>> On 9/27/2022 4:48 PM, Felix Kuehling wrote:
> >>>>> On 2022-09-27 at 02:12, Christian König wrote:
> >>>>>> On 26.09.22 at 23:40, Shashank Sharma wrote:
> >>>>>>> This patch switches the GPU workload mode to/from
> >>>>>>> compute mode, while submitting a compute workload.
> >>>>>>>
> >>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
> >>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@xxxxxxx>
> >>>>>>
> >>>>>> Feel free to add my acked-by, but Felix should probably take a look
> >>>>>> as well.
> >>>>>
> >>>>> This looks OK purely from a compute perspective. But I'm concerned
> >>>>> about the interaction of compute with graphics or multiple graphics
> >>>>> contexts submitting work concurrently. They would constantly override
> >>>>> or disable each other's workload hints.
> >>>>>
> >>>>> For example, you have an amdgpu_ctx with
> >>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
> >>>>> process that also wants the compute profile. Those could be different
> >>>>> processes belonging to different users. Say, KFD enables the compute
> >>>>> profile first. Then the graphics context submits a job. At the start
> >>>>> of the job, the compute profile is enabled. That's a no-op because
> >>>>> KFD already enabled the compute profile. When the job finishes, it
> >>>>> disables the compute profile for everyone, including KFD. That's
> >>>>> unexpected.
> >>>>>
> >>>>
> >>>> In this case, it will not disable the compute profile, as the
> >>>> reference counter will not be zero. The reset_profile() will only act
> >>>> if the reference counter is 0.
> >>>
> >>> OK, I missed the reference counter.
> >>>
> >>>
> >>>>
> >>>> But I would be happy to get any inputs about a policy which can be
> >>>> more sustainable and gets better outputs, for example:
> >>>> - should we not allow a profile change if a PP mode is already
> >>>> applied, and keep it on an early-bird basis?
> >>>>
> >>>> For example: Policy A
> >>>> - Job A sets the profile to compute
> >>>> - Job B tries to set the profile to 3D, but we do not allow it as
> >>>> Job A has not finished yet.
> >>>>
> >>>> Or Policy B: Current one
> >>>> - Job A sets the profile to compute
> >>>> - Job B tries to set the profile to 3D, and we allow it. Job A also
> >>>> runs in PP 3D
> >>>> - Job B finishes, but does not reset PP as the reference count is
> >>>> not zero due to compute
> >>>> - Job A finishes, profile reset to NONE
> >>>
> >>> I think this won't work. As I understand it,
> >>> amdgpu_dpm_switch_power_profile enables and disables individual
> >>> profiles. Disabling the 3D profile doesn't disable the compute profile
> >>> at the same time. I think you'll need one refcount per profile.
> >>>
> >>> Regards,
> >>> Felix
> >>
> >> Thanks, this is exactly what I was looking for. I think Alex's initial
> >> idea was around it, but I was under the assumption that there is only
> >> one HW profile in the SMU which keeps getting overwritten. This can
> >> solve our problems, as I can create an array of reference counters and
> >> disable only the profile whose reference counter goes to 0.
> >
> > It's been a while since I paged any of this code into my head, but I
> > believe the actual workload message in the SMU is a mask where you can
> > specify multiple workload types at the same time and the SMU will
> > arbitrate between them internally. E.g., the most aggressive one will
> > be selected out of the ones specified. I think in the driver we just
> > set one bit at a time using the current interface. It might be better
> > to change the interface and just refcount the hint types, and then
> > when we call the set function, look at the refcounts for each hint
> > type and set the mask as appropriate.
> >
> > Alex
> >
>
> Hey Alex,
> Thanks for your comment. If that is the case, this current patch series
> works as-is, and no changes would be required. Please let me know if my
> understanding is correct:
>
> Assumption: Order of aggression: 3D > Media > Compute
>
> - Job 1: Requests mode compute: PP changed to compute, ref count 1
> - Job 2: Requests mode media: PP changed to media, ref count 2
> - Job 3: Requests mode 3D: PP changed to 3D, ref count 3
> - Job 1 finishes, drops ref count to 2, doesn't reset the PP as ref > 0,
> PP still 3D
> - Job 3 finishes, drops ref count to 1, doesn't reset the PP as ref > 0,
> PP still 3D
> - Job 2 finishes, drops ref count to 0, PP changed to NONE
>
> In this way, every job will be operating in the power profile of desired
> aggression or higher, and this API guarantees execution at least in the
> desired power profile.

I'm not entirely sure on the relative levels of aggression, but I believe
the SMU prioritizes them by index. E.g.,

#define WORKLOAD_PPLIB_DEFAULT_BIT        0
#define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1
#define WORKLOAD_PPLIB_POWER_SAVING_BIT   2
#define WORKLOAD_PPLIB_VIDEO_BIT          3
#define WORKLOAD_PPLIB_VR_BIT             4
#define WORKLOAD_PPLIB_COMPUTE_BIT        5
#define WORKLOAD_PPLIB_CUSTOM_BIT         6

3D < video < VR < compute < custom

VR and compute are the most aggressive. Custom takes preference because
it's user-customizable.

Alex

>
> - Shashank
>
>
>
> >>
> >> - Shashank
> >>
> >>>
> >>>
> >>>>
> >>>>
> >>>> Or anything else?
> >>>>
> >>>> Regards,
> >>>> Shashank
> >>>>
> >>>>
> >>>>> Or you have multiple VCN contexts. When context1 finishes a job, it
> >>>>> disables the VIDEO profile. But context2 still has a job on the other
> >>>>> VCN engine and wants the VIDEO profile to still be enabled.
> >>>>>
> >>>>> Regards,
> >>>>> Felix
> >>>>>
> >>>>>
> >>>>>>
> >>>>>> Christian.
> >>>>>>
> >>>>>>> ---
> >>>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++---
> >>>>>>>   1 file changed, 11 insertions(+), 3 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> >>>>>>> index 5e53a5293935..1caed319a448 100644
> >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> >>>>>>> @@ -34,6 +34,7 @@
> >>>>>>>   #include "amdgpu_ras.h"
> >>>>>>>   #include "amdgpu_umc.h"
> >>>>>>>   #include "amdgpu_reset.h"
> >>>>>>> +#include "amdgpu_ctx_workload.h"
> >>>>>>>   /* Total memory size in system memory and all GPU VRAM. Used to
> >>>>>>>    * estimate worst case amount of memory to reserve for page tables
> >>>>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,
> >>>>>>>   void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle)
> >>>>>>>   {
> >>>>>>> -    amdgpu_dpm_switch_power_profile(adev,
> >>>>>>> -                    PP_SMC_POWER_PROFILE_COMPUTE,
> >>>>>>> -                    !idle);
> >>>>>>> +    int ret;
> >>>>>>> +
> >>>>>>> +    if (idle)
> >>>>>>> +        ret = amdgpu_clear_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
> >>>>>>> +    else
> >>>>>>> +        ret = amdgpu_set_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
> >>>>>>> +
> >>>>>>> +    if (ret)
> >>>>>>> +        drm_warn(&adev->ddev, "Failed to %s power profile to compute mode\n",
> >>>>>>> +             idle ? "reset" : "set");
> >>>>>>>   }
> >>>>>>>   bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)
> >>>>>>
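[Editor's note] The scheme the thread converges on (one refcount per workload hint, with the SMU arbitrating a mask where the highest-index bit wins) can be sketched in plain C. This is a hypothetical userspace model, not the actual amdgpu/SMU code: `hint_refcount`, `workload_mask()`, `effective_profile()`, `set_workload_hint()`, and `clear_workload_hint()` are invented names for illustration; only the `WORKLOAD_PPLIB_*` bit values come from the thread.

```c
#include <assert.h>

/* Workload bits as listed by Alex in the thread (SMU PPLib interface). */
#define WORKLOAD_PPLIB_DEFAULT_BIT        0
#define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1
#define WORKLOAD_PPLIB_POWER_SAVING_BIT   2
#define WORKLOAD_PPLIB_VIDEO_BIT          3
#define WORKLOAD_PPLIB_VR_BIT             4
#define WORKLOAD_PPLIB_COMPUTE_BIT        5
#define WORKLOAD_PPLIB_CUSTOM_BIT         6
#define WORKLOAD_NUM_BITS                 7

/* One reference counter per profile, per Felix's suggestion. */
static int hint_refcount[WORKLOAD_NUM_BITS];

/* Rebuild the workload mask from the refcounts: every hint with at
 * least one active user contributes its bit (hypothetical stand-in for
 * what a reworked set function would send to the SMU). */
static unsigned int workload_mask(void)
{
    unsigned int mask = 0;
    int bit;

    for (bit = 0; bit < WORKLOAD_NUM_BITS; bit++)
        if (hint_refcount[bit] > 0)
            mask |= 1u << bit;
    return mask;
}

/* Model of the SMU arbitration: the highest set bit in the mask wins. */
static int effective_profile(void)
{
    int bit, best = WORKLOAD_PPLIB_DEFAULT_BIT;

    for (bit = 0; bit < WORKLOAD_NUM_BITS; bit++)
        if (workload_mask() & (1u << bit))
            best = bit;
    return best;
}

/* A job entering/leaving a profile only touches its own counter. */
static void set_workload_hint(int bit)
{
    hint_refcount[bit]++;
}

static void clear_workload_hint(int bit)
{
    assert(hint_refcount[bit] > 0);
    hint_refcount[bit]--;
}
```

Under this model, Shashank's Job 1/2/3 walkthrough works out regardless of completion order: while compute, video, and 3D jobs are all in flight the effective profile is compute (bit 5, the most aggressive of the three), it stays compute until the compute job's counter drops to zero, then falls back to video, and finally to default once no counter is held. No job can clear a profile that another job still holds.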