Re: [PATCH] drm/amdgpu: move buffer funcs setting up a level

On Tue, Nov 7, 2023 at 9:19 AM Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
>
> On Tue, Nov 7, 2023 at 5:52 AM Christian König
> <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
> >
> > On 03.11.23 at 23:10, Alex Deucher wrote:
> > > On Fri, Nov 3, 2023 at 4:17 PM Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
> > >> On Thu, Oct 26, 2023 at 4:17 PM Luben Tuikov <ltuikov89@xxxxxxxxx> wrote:
> > >>> Pushed to drm-misc-next.
> > >> BTW, I'm seeing the following on older GPUs with VCE and UVD even with
> > >> this patch:
> > >> [   11.886024] amdgpu 0000:0a:00.0: [drm] *ERROR* drm_sched_job_init:
> > >> entity has no rq!
> > >> [   11.886028] amdgpu 0000:0a:00.0: [drm:amdgpu_ib_ring_tests
> > >> [amdgpu]] *ERROR* IB test failed on uvd (-2).
> > >> [   11.889927] amdgpu 0000:0a:00.0: [drm] *ERROR* drm_sched_job_init:
> > >> entity has no rq!
> > >> [   11.889930] amdgpu 0000:0a:00.0: [drm:amdgpu_ib_ring_tests
> > >> [amdgpu]] *ERROR* IB test failed on vce0 (-2).
> > >> [   11.890172] [drm:process_one_work] *ERROR* ib ring test failed (-2).
> > >> Seems to be specific to UVD and VCE, I don't see anything similar with
> > >> VCN, but the flows for both are pretty similar.  Not sure why we are
> > >> not seeing it for VCN.  Just a heads up if you have any ideas.  Will
> > >> take a closer look next week.
> > > + Leo
> > >
> > > I found the problem.  We set up scheduling entities for UVD and VCE
> > > specifically and not for any other engines.  I don't remember why
> > > offhand.  I'm guessing maybe to deal with the session limits on UVD
> > > and VCE?  If so I'm not sure of a clean way to fix this.
> >
> > I haven't looked through all my mails yet so could be that Leo has
> > already answered this.
> >
> > The UVD/VCE entities are used for the older chips where applications
> > have to use create/destroy messages to the firmware.
> >
> > If an application exits without cleaning up its handles, the kernel
> > sends the appropriate destroy messages itself. For an example see
> > amdgpu_uvd_free_handles().
> >
> > We used to initialize those entities with separate calls after the
> > scheduler had been brought up, see amdgpu_uvd_entity_init() for an example.
> >
> > But this was somehow messed up and we now do the call to
> > amdgpu_uvd_entity_init() at the end of *_sw_init() instead of _late_init().
> >
> > I suggest just adding a function that can be used for the
> > late_init() callback of the UVD/VCE blocks.
>
> I guess the issue is that we only need to initialize the entity once
> so sw_init makes sense.  All of the other functions get called at
> resume time, etc.  I think we could probably put it into
> amdgpu_device_init_schedulers() somehow.

I think something like this might do the trick.

Alex

>
> Alex
>
> >
> > Christian.
> >
> > >
> > > Alex
> > >
> > >> Alex
> > >>
> > >>> Regards,
> > >>> Luben
> > >>>
> > >>> On 2023-10-26 15:52, Luben Tuikov wrote:
> > >>>> On 2023-10-26 15:32, Alex Deucher wrote:
> > >>>>> On Thu, Oct 26, 2023 at 2:22 AM Christian König
> > >>>>> <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
> > >>>>>> On 25.10.23 at 19:19, Alex Deucher wrote:
> > >>>>>>> Rather than doing this in the IP code for the SDMA paging
> > >>>>>>> engine, move it up to the core device level init.
> > >>>>>>> This should fix the scheduler init ordering.
> > >>>>>>>
> > >>>>>>> v2: drop extra parens
> > >>>>>>> v3: drop SDMA helpers
> > >>>>>>>
> > >>>>>>> Tested-by: Luben Tuikov <luben.tuikov@xxxxxxx>
> > >>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
> > >>>>>> I don't know offhand whether the high-level functions really cover everything,
> > >>>>>> so only Acked-by: Christian König <christian.koenig@xxxxxxx> for now.
> > >>>>>>
> > >>>>> Luben,
> > >>>>>
> > >>>>> Was this needed for some of the scheduler stuff that is pending?  If
> > >>>>> you would rather take it via drm-misc to align with the scheduler
> > >>>>> changes, that works for me, otherwise I can take it via the amdgpu
> > >>>>> tree.
> > >>>> Hi Alex,
> > >>>>
> > >>>> Yes, it does.
> > >>>>
> > >>>> I can take it via drm-misc-next as that's where the scheduler changes landed.
> > >>>>
> > >>>> I'll add Christian's Acked-by.
> > >>>>
> > >>>> I'll add a Fixes tag because ideally it should've gone before the dynamic
> > >>>> sched_rq commit.
> > >>>>
> > >>>> Thanks for the heads-up!
> > >>>>
> > >>>> Regards,
> > >>>> Luben
> > >>>>
> > >>>>
> > >>>>
> > >>>>> Thanks,
> > >>>>>
> > >>>>> Alex
> > >>>>>
> > >>>>>
> > >>>>>> Christian.
> > >>>>>>
> > >>>>>>> ---
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +++++++++++++++
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c   | 21 ---------------------
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h   |  1 -
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/cik_sdma.c      |  5 -----
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c     |  5 -----
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c     |  5 -----
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c     | 16 +---------------
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c     | 10 +---------
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c     | 10 +---------
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c     | 10 +---------
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/si_dma.c        |  5 -----
> > >>>>>>>    11 files changed, 19 insertions(+), 84 deletions(-)
> > >>>>>>>
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > >>>>>>> index 2031a467b721..5c90080e93ba 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > >>>>>>> @@ -2662,6 +2662,9 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
> > >>>>>>>        if (r)
> > >>>>>>>                goto init_failed;
> > >>>>>>>
> > >>>>>>> +     if (adev->mman.buffer_funcs_ring->sched.ready)
> > >>>>>>> +             amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>> +
> > >>>>>>>        /* Don't init kfd if whole hive need to be reset during init */
> > >>>>>>>        if (!adev->gmc.xgmi.pending_reset) {
> > >>>>>>>                kgd2kfd_init_zone_device(adev);
> > >>>>>>> @@ -3260,6 +3263,8 @@ int amdgpu_device_ip_suspend(struct amdgpu_device *adev)
> > >>>>>>>                amdgpu_virt_request_full_gpu(adev, false);
> > >>>>>>>        }
> > >>>>>>>
> > >>>>>>> +     amdgpu_ttm_set_buffer_funcs_status(adev, false);
> > >>>>>>> +
> > >>>>>>>        r = amdgpu_device_ip_suspend_phase1(adev);
> > >>>>>>>        if (r)
> > >>>>>>>                return r;
> > >>>>>>> @@ -3449,6 +3454,9 @@ static int amdgpu_device_ip_resume(struct amdgpu_device *adev)
> > >>>>>>>
> > >>>>>>>        r = amdgpu_device_ip_resume_phase2(adev);
> > >>>>>>>
> > >>>>>>> +     if (adev->mman.buffer_funcs_ring->sched.ready)
> > >>>>>>> +             amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>> +
> > >>>>>>>        return r;
> > >>>>>>>    }
> > >>>>>>>
> > >>>>>>> @@ -4236,6 +4244,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
> > >>>>>>>        /* disable ras feature must before hw fini */
> > >>>>>>>        amdgpu_ras_pre_fini(adev);
> > >>>>>>>
> > >>>>>>> +     amdgpu_ttm_set_buffer_funcs_status(adev, false);
> > >>>>>>> +
> > >>>>>>>        amdgpu_device_ip_fini_early(adev);
> > >>>>>>>
> > >>>>>>>        amdgpu_irq_fini_hw(adev);
> > >>>>>>> @@ -4407,6 +4417,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
> > >>>>>>>
> > >>>>>>>        amdgpu_ras_suspend(adev);
> > >>>>>>>
> > >>>>>>> +     amdgpu_ttm_set_buffer_funcs_status(adev, false);
> > >>>>>>> +
> > >>>>>>>        amdgpu_device_ip_suspend_phase1(adev);
> > >>>>>>>
> > >>>>>>>        if (!adev->in_s0ix)
> > >>>>>>> @@ -5178,6 +5190,9 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
> > >>>>>>>                                if (r)
> > >>>>>>>                                        goto out;
> > >>>>>>>
> > >>>>>>> +                             if (tmp_adev->mman.buffer_funcs_ring->sched.ready)
> > >>>>>>> +                                     amdgpu_ttm_set_buffer_funcs_status(tmp_adev, true);
> > >>>>>>> +
> > >>>>>>>                                if (vram_lost)
> > >>>>>>>                                        amdgpu_device_fill_reset_magic(tmp_adev);
> > >>>>>>>
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> > >>>>>>> index e8cbc4142d80..1d9d187de6ee 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> > >>>>>>> @@ -292,27 +292,6 @@ int amdgpu_sdma_init_microcode(struct amdgpu_device *adev,
> > >>>>>>>        return err;
> > >>>>>>>    }
> > >>>>>>>
> > >>>>>>> -void amdgpu_sdma_unset_buffer_funcs_helper(struct amdgpu_device *adev)
> > >>>>>>> -{
> > >>>>>>> -     struct amdgpu_ring *sdma;
> > >>>>>>> -     int i;
> > >>>>>>> -
> > >>>>>>> -     for (i = 0; i < adev->sdma.num_instances; i++) {
> > >>>>>>> -             if (adev->sdma.has_page_queue) {
> > >>>>>>> -                     sdma = &adev->sdma.instance[i].page;
> > >>>>>>> -                     if (adev->mman.buffer_funcs_ring == sdma) {
> > >>>>>>> -                             amdgpu_ttm_set_buffer_funcs_status(adev, false);
> > >>>>>>> -                             break;
> > >>>>>>> -                     }
> > >>>>>>> -             }
> > >>>>>>> -             sdma = &adev->sdma.instance[i].ring;
> > >>>>>>> -             if (adev->mman.buffer_funcs_ring == sdma) {
> > >>>>>>> -                     amdgpu_ttm_set_buffer_funcs_status(adev, false);
> > >>>>>>> -                     break;
> > >>>>>>> -             }
> > >>>>>>> -     }
> > >>>>>>> -}
> > >>>>>>> -
> > >>>>>>>    int amdgpu_sdma_ras_sw_init(struct amdgpu_device *adev)
> > >>>>>>>    {
> > >>>>>>>        int err = 0;
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> > >>>>>>> index 513ac22120c1..173a2a308078 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> > >>>>>>> @@ -169,7 +169,6 @@ int amdgpu_sdma_init_microcode(struct amdgpu_device *adev, u32 instance,
> > >>>>>>>                               bool duplicate);
> > >>>>>>>    void amdgpu_sdma_destroy_inst_ctx(struct amdgpu_device *adev,
> > >>>>>>>            bool duplicate);
> > >>>>>>> -void amdgpu_sdma_unset_buffer_funcs_helper(struct amdgpu_device *adev);
> > >>>>>>>    int amdgpu_sdma_ras_sw_init(struct amdgpu_device *adev);
> > >>>>>>>
> > >>>>>>>    #endif
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
> > >>>>>>> index ee5dce6f6043..a3fccc4c1f43 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
> > >>>>>>> @@ -308,8 +308,6 @@ static void cik_sdma_gfx_stop(struct amdgpu_device *adev)
> > >>>>>>>        u32 rb_cntl;
> > >>>>>>>        int i;
> > >>>>>>>
> > >>>>>>> -     amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> -
> > >>>>>>>        for (i = 0; i < adev->sdma.num_instances; i++) {
> > >>>>>>>                rb_cntl = RREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i]);
> > >>>>>>>                rb_cntl &= ~SDMA0_GFX_RB_CNTL__RB_ENABLE_MASK;
> > >>>>>>> @@ -498,9 +496,6 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
> > >>>>>>>                r = amdgpu_ring_test_helper(ring);
> > >>>>>>>                if (r)
> > >>>>>>>                        return r;
> > >>>>>>> -
> > >>>>>>> -             if (adev->mman.buffer_funcs_ring == ring)
> > >>>>>>> -                     amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>>        }
> > >>>>>>>
> > >>>>>>>        return 0;
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
> > >>>>>>> index b58a13bd75db..45377a175250 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
> > >>>>>>> @@ -339,8 +339,6 @@ static void sdma_v2_4_gfx_stop(struct amdgpu_device *adev)
> > >>>>>>>        u32 rb_cntl, ib_cntl;
> > >>>>>>>        int i;
> > >>>>>>>
> > >>>>>>> -     amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> -
> > >>>>>>>        for (i = 0; i < adev->sdma.num_instances; i++) {
> > >>>>>>>                rb_cntl = RREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i]);
> > >>>>>>>                rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0);
> > >>>>>>> @@ -474,9 +472,6 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
> > >>>>>>>                r = amdgpu_ring_test_helper(ring);
> > >>>>>>>                if (r)
> > >>>>>>>                        return r;
> > >>>>>>> -
> > >>>>>>> -             if (adev->mman.buffer_funcs_ring == ring)
> > >>>>>>> -                     amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>>        }
> > >>>>>>>
> > >>>>>>>        return 0;
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> > >>>>>>> index c5ea32687eb5..2ad615be4bb3 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> > >>>>>>> @@ -513,8 +513,6 @@ static void sdma_v3_0_gfx_stop(struct amdgpu_device *adev)
> > >>>>>>>        u32 rb_cntl, ib_cntl;
> > >>>>>>>        int i;
> > >>>>>>>
> > >>>>>>> -     amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> -
> > >>>>>>>        for (i = 0; i < adev->sdma.num_instances; i++) {
> > >>>>>>>                rb_cntl = RREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i]);
> > >>>>>>>                rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0);
> > >>>>>>> @@ -746,9 +744,6 @@ static int sdma_v3_0_gfx_resume(struct amdgpu_device *adev)
> > >>>>>>>                r = amdgpu_ring_test_helper(ring);
> > >>>>>>>                if (r)
> > >>>>>>>                        return r;
> > >>>>>>> -
> > >>>>>>> -             if (adev->mman.buffer_funcs_ring == ring)
> > >>>>>>> -                     amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>>        }
> > >>>>>>>
> > >>>>>>>        return 0;
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > >>>>>>> index 683d51ae4bf1..3d68dd5523c6 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > >>>>>>> @@ -877,8 +877,6 @@ static void sdma_v4_0_gfx_enable(struct amdgpu_device *adev, bool enable)
> > >>>>>>>        u32 rb_cntl, ib_cntl;
> > >>>>>>>        int i;
> > >>>>>>>
> > >>>>>>> -     amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> -
> > >>>>>>>        for (i = 0; i < adev->sdma.num_instances; i++) {
> > >>>>>>>                rb_cntl = RREG32_SDMA(i, mmSDMA0_GFX_RB_CNTL);
> > >>>>>>>                rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, enable ? 1 : 0);
> > >>>>>>> @@ -913,8 +911,6 @@ static void sdma_v4_0_page_stop(struct amdgpu_device *adev)
> > >>>>>>>        u32 rb_cntl, ib_cntl;
> > >>>>>>>        int i;
> > >>>>>>>
> > >>>>>>> -     amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> -
> > >>>>>>>        for (i = 0; i < adev->sdma.num_instances; i++) {
> > >>>>>>>                rb_cntl = RREG32_SDMA(i, mmSDMA0_PAGE_RB_CNTL);
> > >>>>>>>                rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_PAGE_RB_CNTL,
> > >>>>>>> @@ -1402,13 +1398,7 @@ static int sdma_v4_0_start(struct amdgpu_device *adev)
> > >>>>>>>                        r = amdgpu_ring_test_helper(page);
> > >>>>>>>                        if (r)
> > >>>>>>>                                return r;
> > >>>>>>> -
> > >>>>>>> -                     if (adev->mman.buffer_funcs_ring == page)
> > >>>>>>> -                             amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>>                }
> > >>>>>>> -
> > >>>>>>> -             if (adev->mman.buffer_funcs_ring == ring)
> > >>>>>>> -                     amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>>        }
> > >>>>>>>
> > >>>>>>>        return r;
> > >>>>>>> @@ -1921,11 +1911,8 @@ static int sdma_v4_0_hw_fini(void *handle)
> > >>>>>>>        struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > >>>>>>>        int i;
> > >>>>>>>
> > >>>>>>> -     if (amdgpu_sriov_vf(adev)) {
> > >>>>>>> -             /* disable the scheduler for SDMA */
> > >>>>>>> -             amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> +     if (amdgpu_sriov_vf(adev))
> > >>>>>>>                return 0;
> > >>>>>>> -     }
> > >>>>>>>
> > >>>>>>>        if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__SDMA)) {
> > >>>>>>>                for (i = 0; i < adev->sdma.num_instances; i++) {
> > >>>>>>> @@ -1964,7 +1951,6 @@ static int sdma_v4_0_resume(void *handle)
> > >>>>>>>        if (adev->in_s0ix) {
> > >>>>>>>                sdma_v4_0_enable(adev, true);
> > >>>>>>>                sdma_v4_0_gfx_enable(adev, true);
> > >>>>>>> -             amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>>                return 0;
> > >>>>>>>        }
> > >>>>>>>
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> > >>>>>>> index be5d099c9898..c78027ebdcb9 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> > >>>>>>> @@ -559,8 +559,6 @@ static void sdma_v5_0_gfx_stop(struct amdgpu_device *adev)
> > >>>>>>>        u32 rb_cntl, ib_cntl;
> > >>>>>>>        int i;
> > >>>>>>>
> > >>>>>>> -     amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> -
> > >>>>>>>        for (i = 0; i < adev->sdma.num_instances; i++) {
> > >>>>>>>                rb_cntl = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL));
> > >>>>>>>                rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0);
> > >>>>>>> @@ -825,9 +823,6 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
> > >>>>>>>                r = amdgpu_ring_test_helper(ring);
> > >>>>>>>                if (r)
> > >>>>>>>                        return r;
> > >>>>>>> -
> > >>>>>>> -             if (adev->mman.buffer_funcs_ring == ring)
> > >>>>>>> -                     amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>>        }
> > >>>>>>>
> > >>>>>>>        return 0;
> > >>>>>>> @@ -1426,11 +1421,8 @@ static int sdma_v5_0_hw_fini(void *handle)
> > >>>>>>>    {
> > >>>>>>>        struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > >>>>>>>
> > >>>>>>> -     if (amdgpu_sriov_vf(adev)) {
> > >>>>>>> -             /* disable the scheduler for SDMA */
> > >>>>>>> -             amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> +     if (amdgpu_sriov_vf(adev))
> > >>>>>>>                return 0;
> > >>>>>>> -     }
> > >>>>>>>
> > >>>>>>>        sdma_v5_0_ctx_switch_enable(adev, false);
> > >>>>>>>        sdma_v5_0_enable(adev, false);
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> > >>>>>>> index a3e8b10c071c..2e35f3571774 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> > >>>>>>> @@ -364,8 +364,6 @@ static void sdma_v5_2_gfx_stop(struct amdgpu_device *adev)
> > >>>>>>>        u32 rb_cntl, ib_cntl;
> > >>>>>>>        int i;
> > >>>>>>>
> > >>>>>>> -     amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> -
> > >>>>>>>        for (i = 0; i < adev->sdma.num_instances; i++) {
> > >>>>>>>                rb_cntl = RREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL));
> > >>>>>>>                rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0);
> > >>>>>>> @@ -625,9 +623,6 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
> > >>>>>>>                r = amdgpu_ring_test_helper(ring);
> > >>>>>>>                if (r)
> > >>>>>>>                        return r;
> > >>>>>>> -
> > >>>>>>> -             if (adev->mman.buffer_funcs_ring == ring)
> > >>>>>>> -                     amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>>        }
> > >>>>>>>
> > >>>>>>>        return 0;
> > >>>>>>> @@ -1284,11 +1279,8 @@ static int sdma_v5_2_hw_fini(void *handle)
> > >>>>>>>    {
> > >>>>>>>        struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > >>>>>>>
> > >>>>>>> -     if (amdgpu_sriov_vf(adev)) {
> > >>>>>>> -             /* disable the scheduler for SDMA */
> > >>>>>>> -             amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> +     if (amdgpu_sriov_vf(adev))
> > >>>>>>>                return 0;
> > >>>>>>> -     }
> > >>>>>>>
> > >>>>>>>        sdma_v5_2_ctx_switch_enable(adev, false);
> > >>>>>>>        sdma_v5_2_enable(adev, false);
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
> > >>>>>>> index 445a34549d2c..1c6ff511f501 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
> > >>>>>>> @@ -348,8 +348,6 @@ static void sdma_v6_0_gfx_stop(struct amdgpu_device *adev)
> > >>>>>>>        u32 rb_cntl, ib_cntl;
> > >>>>>>>        int i;
> > >>>>>>>
> > >>>>>>> -     amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> -
> > >>>>>>>        for (i = 0; i < adev->sdma.num_instances; i++) {
> > >>>>>>>                rb_cntl = RREG32_SOC15_IP(GC, sdma_v6_0_get_reg_offset(adev, i, regSDMA0_QUEUE0_RB_CNTL));
> > >>>>>>>                rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_QUEUE0_RB_CNTL, RB_ENABLE, 0);
> > >>>>>>> @@ -561,9 +559,6 @@ static int sdma_v6_0_gfx_resume(struct amdgpu_device *adev)
> > >>>>>>>                r = amdgpu_ring_test_helper(ring);
> > >>>>>>>                if (r)
> > >>>>>>>                        return r;
> > >>>>>>> -
> > >>>>>>> -             if (adev->mman.buffer_funcs_ring == ring)
> > >>>>>>> -                     amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>>        }
> > >>>>>>>
> > >>>>>>>        return 0;
> > >>>>>>> @@ -1308,11 +1303,8 @@ static int sdma_v6_0_hw_fini(void *handle)
> > >>>>>>>    {
> > >>>>>>>        struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > >>>>>>>
> > >>>>>>> -     if (amdgpu_sriov_vf(adev)) {
> > >>>>>>> -             /* disable the scheduler for SDMA */
> > >>>>>>> -             amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> +     if (amdgpu_sriov_vf(adev))
> > >>>>>>>                return 0;
> > >>>>>>> -     }
> > >>>>>>>
> > >>>>>>>        sdma_v6_0_ctxempty_int_enable(adev, false);
> > >>>>>>>        sdma_v6_0_enable(adev, false);
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c b/drivers/gpu/drm/amd/amdgpu/si_dma.c
> > >>>>>>> index 42c4547f32ec..9aa0e11ee673 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/si_dma.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c
> > >>>>>>> @@ -115,8 +115,6 @@ static void si_dma_stop(struct amdgpu_device *adev)
> > >>>>>>>        u32 rb_cntl;
> > >>>>>>>        unsigned i;
> > >>>>>>>
> > >>>>>>> -     amdgpu_sdma_unset_buffer_funcs_helper(adev);
> > >>>>>>> -
> > >>>>>>>        for (i = 0; i < adev->sdma.num_instances; i++) {
> > >>>>>>>                /* dma0 */
> > >>>>>>>                rb_cntl = RREG32(DMA_RB_CNTL + sdma_offsets[i]);
> > >>>>>>> @@ -177,9 +175,6 @@ static int si_dma_start(struct amdgpu_device *adev)
> > >>>>>>>                r = amdgpu_ring_test_helper(ring);
> > >>>>>>>                if (r)
> > >>>>>>>                        return r;
> > >>>>>>> -
> > >>>>>>> -             if (adev->mman.buffer_funcs_ring == ring)
> > >>>>>>> -                     amdgpu_ttm_set_buffer_funcs_status(adev, true);
> > >>>>>>>        }
> > >>>>>>>
> > >>>>>>>        return 0;
> >
From 29cce6524ce556efa47176205fe28f0e4f687d68 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@xxxxxxx>
Date: Tue, 7 Nov 2023 09:43:33 -0500
Subject: [PATCH] drm/amdgpu: move UVD and VCE sched entity init after sched
 init

We need kernel scheduling entities to clean up handles when
applications exit without releasing them.  With commit 56e449603f0ac5
("drm/sched: Convert the GPU scheduler to variable number of run-queues")
the scheduler entities have to be created after scheduler init, so
change the ordering to fix this.

Fixes: 56e449603f0ac5 ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
---
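Note (not part of the commit): the ordering problem described in the
commit message can be illustrated with a small userspace toy model.
This is only a sketch under stated assumptions.  All toy_* names are
hypothetical and are not DRM scheduler API.  It assumes, as the "entity
has no rq!" messages and the -2 (-ENOENT on Linux) results earlier in
the thread suggest, that an entity initialized before its scheduler has
run-queues ends up without one, and that job init on such an entity
fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_NUM_PRIO 4

/* Hypothetical stand-in for a scheduler run-queue. */
struct toy_rq {
	int unused;
};

/* Hypothetical scheduler: run-queues only become visible after init. */
struct toy_sched {
	struct toy_rq *rq;			/* NULL until toy_sched_init() */
	struct toy_rq storage[TOY_NUM_PRIO];
};

/* Hypothetical entity: remembers the run-queue picked at init time. */
struct toy_entity {
	struct toy_rq *rq;
};

/* Make the scheduler's run-queues available. */
static void toy_sched_init(struct toy_sched *s)
{
	s->rq = s->storage;
}

/* Bind the entity to a run-queue, if the scheduler has any yet. */
static void toy_entity_init(struct toy_entity *e, struct toy_sched *s, int prio)
{
	assert(prio >= 0 && prio < TOY_NUM_PRIO);
	e->rq = s->rq ? &s->rq[prio] : NULL;
}

/* A job can only be created on an entity that has a run-queue. */
static int toy_job_init(const struct toy_entity *e)
{
	return e->rq ? 0 : -ENOENT;
}
```

In this model, calling toy_entity_init() before toy_sched_init() leaves
the entity without a run-queue, so toy_job_init() returns -ENOENT.
Calling it after toy_sched_init() succeeds, which is the ordering the
patch below aims for by creating the UVD and VCE entities from
amdgpu_device_init_schedulers().
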
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 22 ++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c    | 24 ----------------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h    |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c    | 24 ----------------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h    |  1 -
 drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c      |  2 --
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c      |  2 --
 drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c      |  2 --
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c      |  2 --
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c      |  4 ----
 drivers/gpu/drm/amd/amdgpu/vce_v2_0.c      |  2 --
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c      |  2 --
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c      |  5 -----
 13 files changed, 22 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e2199d8fd30e..7b0da4442abb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2463,6 +2463,7 @@ static int amdgpu_device_fw_loading(struct amdgpu_device *adev)
 
 static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
 {
+	struct drm_gpu_scheduler *sched;
 	long timeout;
 	int r, i;
 
@@ -2498,6 +2499,27 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
 				  ring->name);
 			return r;
 		}
+		/* set up the UVD and VCE entities to properly deal with handles */
+		if (ring == &adev->uvd.inst[0].ring) {
+			sched = &ring->sched;
+			r = drm_sched_entity_init(&adev->uvd.entity, DRM_SCHED_PRIORITY_NORMAL,
+						  &sched, 1, NULL);
+			if (r) {
+				DRM_ERROR("Failed to create UVD scheduling entity on ring %s.\n",
+					  ring->name);
+				return r;
+			}
+		}
+		if (ring == &adev->vce.ring[0]) {
+			sched = &ring->sched;
+			r = drm_sched_entity_init(&adev->vce.entity, DRM_SCHED_PRIORITY_NORMAL,
+						  &sched, 1, NULL);
+			if (r) {
+				DRM_ERROR("Failed to create VCE scheduling entity on ring %s.\n",
+					  ring->name);
+				return r;
+			}
+		}
 	}
 
 	amdgpu_xcp_update_partition_sched_list(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index 815b7c34ed33..4abcb09befeb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -394,30 +394,6 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev)
 	return 0;
 }
 
-/**
- * amdgpu_uvd_entity_init - init entity
- *
- * @adev: amdgpu_device pointer
- *
- */
-int amdgpu_uvd_entity_init(struct amdgpu_device *adev)
-{
-	struct amdgpu_ring *ring;
-	struct drm_gpu_scheduler *sched;
-	int r;
-
-	ring = &adev->uvd.inst[0].ring;
-	sched = &ring->sched;
-	r = drm_sched_entity_init(&adev->uvd.entity, DRM_SCHED_PRIORITY_NORMAL,
-				  &sched, 1, NULL);
-	if (r) {
-		DRM_ERROR("Failed setting up UVD kernel entity.\n");
-		return r;
-	}
-
-	return 0;
-}
-
 int amdgpu_uvd_prepare_suspend(struct amdgpu_device *adev)
 {
 	unsigned int size;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
index a9f342537c68..7c8e7e2b731d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
@@ -73,7 +73,6 @@ struct amdgpu_uvd {
 
 int amdgpu_uvd_sw_init(struct amdgpu_device *adev);
 int amdgpu_uvd_sw_fini(struct amdgpu_device *adev);
-int amdgpu_uvd_entity_init(struct amdgpu_device *adev);
 int amdgpu_uvd_prepare_suspend(struct amdgpu_device *adev);
 int amdgpu_uvd_suspend(struct amdgpu_device *adev);
 int amdgpu_uvd_resume(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index 1904edf68407..d9deda91c5d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -226,30 +226,6 @@ int amdgpu_vce_sw_fini(struct amdgpu_device *adev)
 	return 0;
 }
 
-/**
- * amdgpu_vce_entity_init - init entity
- *
- * @adev: amdgpu_device pointer
- *
- */
-int amdgpu_vce_entity_init(struct amdgpu_device *adev)
-{
-	struct amdgpu_ring *ring;
-	struct drm_gpu_scheduler *sched;
-	int r;
-
-	ring = &adev->vce.ring[0];
-	sched = &ring->sched;
-	r = drm_sched_entity_init(&adev->vce.entity, DRM_SCHED_PRIORITY_NORMAL,
-				  &sched, 1, NULL);
-	if (r != 0) {
-		DRM_ERROR("Failed setting up VCE run queue.\n");
-		return r;
-	}
-
-	return 0;
-}
-
 /**
  * amdgpu_vce_suspend - unpin VCE fw memory
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
index ea680fc9a6c3..ee75f691f28f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
@@ -55,7 +55,6 @@ struct amdgpu_vce {
 
 int amdgpu_vce_sw_init(struct amdgpu_device *adev, unsigned long size);
 int amdgpu_vce_sw_fini(struct amdgpu_device *adev);
-int amdgpu_vce_entity_init(struct amdgpu_device *adev);
 int amdgpu_vce_suspend(struct amdgpu_device *adev);
 int amdgpu_vce_resume(struct amdgpu_device *adev);
 void amdgpu_vce_free_handles(struct amdgpu_device *adev, struct drm_file *filp);
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
index 58a8f78c003c..a6006f231c65 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c
@@ -577,8 +577,6 @@ static int uvd_v3_1_sw_init(void *handle)
 	ptr += ucode_len;
 	memcpy(&adev->uvd.keyselect, ptr, 4);
 
-	r = amdgpu_uvd_entity_init(adev);
-
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
index d3b1e31f5450..1aa09ad7bbe3 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
@@ -127,8 +127,6 @@ static int uvd_v4_2_sw_init(void *handle)
 	if (r)
 		return r;
 
-	r = amdgpu_uvd_entity_init(adev);
-
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
index 5a8116437abf..f8b229b75435 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
@@ -125,8 +125,6 @@ static int uvd_v5_0_sw_init(void *handle)
 	if (r)
 		return r;
 
-	r = amdgpu_uvd_entity_init(adev);
-
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index 74c09230aeb3..a9a6880f44e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -432,8 +432,6 @@ static int uvd_v6_0_sw_init(void *handle)
 		}
 	}
 
-	r = amdgpu_uvd_entity_init(adev);
-
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index 1c42cf10cc29..6068b784dc69 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -480,10 +480,6 @@ static int uvd_v7_0_sw_init(void *handle)
 	if (r)
 		return r;
 
-	r = amdgpu_uvd_entity_init(adev);
-	if (r)
-		return r;
-
 	r = amdgpu_virt_alloc_mm_table(adev);
 	if (r)
 		return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
index 67eb01fef789..a08e7abca423 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
@@ -441,8 +441,6 @@ static int vce_v2_0_sw_init(void *handle)
 			return r;
 	}
 
-	r = amdgpu_vce_entity_init(adev);
-
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 18f6e62af339..f4760748d349 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -450,8 +450,6 @@ static int vce_v3_0_sw_init(void *handle)
 			return r;
 	}
 
-	r = amdgpu_vce_entity_init(adev);
-
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index e0b70cd3b697..06d787385ad4 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -486,11 +486,6 @@ static int vce_v4_0_sw_init(void *handle)
 			return r;
 	}
 
-
-	r = amdgpu_vce_entity_init(adev);
-	if (r)
-		return r;
-
 	r = amdgpu_virt_alloc_mm_table(adev);
 	if (r)
 		return r;
-- 
2.41.0

