[Public] > -----Original Message----- > From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Lazar, > Lijo > Sent: Wednesday, July 19, 2023 12:13 AM > To: Zhu, James <James.Zhu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Zhu, James > <James.Zhu@xxxxxxx>; Kamal, Asad <Asad.Kamal@xxxxxxx>; Zhang, > Hawking <Hawking.Zhang@xxxxxxx> > Subject: Re: [PATCH] drm/amdgpu: Update ring scheduler info as needed > > > > On 7/18/2023 9:39 PM, James Zhu wrote: > > > > On 2023-07-18 11:54, Lazar, Lijo wrote: > >> > >> > >> On 7/18/2023 8:39 PM, James Zhu wrote: > >>> > >>> On 2023-07-18 10:19, Lazar, Lijo wrote: > >>>> > >>>> > >>>> On 7/18/2023 7:30 PM, James Zhu wrote: > >>>>> > >>>>> On 2023-07-18 08:21, Lijo Lazar wrote: > >>>>>> Not all rings have scheduler associated. Only update scheduler > >>>>>> data for rings with scheduler. It could result in out of bound > >>>>>> access as total rings are more than those associated with > >>>>>> particular IPs. > >>>>>> > >>>>>> Signed-off-by: Lijo Lazar <lijo.lazar@xxxxxxx> > >>>>>> --- > >>>>>> drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 2 +- > >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>>>>> > >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c > >>>>>> b/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c > >>>>>> index 72b629a78c62..d0fc62784e82 100644 > >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c > >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c > >>>>>> @@ -134,7 +134,7 @@ static int > >>>>>> aqua_vanjaram_xcp_sched_list_update( > >>>>>> for (i = 0; i < AMDGPU_MAX_RINGS; i++) { > >>>>>> ring = adev->rings[i]; > >>>>>> - if (!ring || !ring->sched.ready) > >>>>>> + if (!ring || !ring->sched.ready || ring->no_scheduler) > >>>>> [JZ] any case for ring->no_scheduler = true, but ring->sched.ready > >>>>> = true? > >>>> > >>>> amdgpu_gfx_kiq_init_ring sets no_scheduler flag true for kiq rings. > >>>> amdgpu_device_init_schedulers inits schedulers only for rings which > >>>> doesn't have no_scheduler. However in gfx_v9_4_3_xcc_kiq_resume, > >>>> the ring is marked with ring->sched.ready = true; > >>>> > >>>> I'm not sure why it's done this way, but this is the case even for > >>>> gfxv9. Driver so far still has some concept abuse on sched.ready. In amdgpu_device_init_schedulers , it's set to be true once setting up sw scheduler for the ring requirement, however, after driver is up, this flag works like a hint to tell the corresponding ring is ready for HW submission after ring test succeeds. For this case, it's probably caused by amdgpu_gfx_enable_kcq calling amdgpu_ring_test_helper, which sets sched.ready unconditionally based on ring test result. This will lead value mismatch between ring->no_scheduler and ring->sched.ready. If my understanding is correct, I think adding a check in this helper function which only sets sched.ready when no_scheduler is false will help. So I will provide a patch soon to prevent this mismatch in future. +if (!ring->no_scheduler) + ring->sched.ready != r; Regards, Guchun > >>> [JZ] I think the sched.ready confuses people. here just means ring > >>> is ready be used. > >> > >> Thanks for the clarification. One question is then do we need to > >> check ring ready status for assigning xcp ids or just check if the > >> ring is associated with a scheduler? > >> > >> Keep only this - if (!ring || ring->no_scheduler) to assign xcp ids > >> and leave the ring ready status to the logic which really accepts > >> jobs on the ring? > > [JZ] I think keeping ring->sched.ready will reduce schedule list which > > will save a little scheduling time. > > Fine, will just push this one. > > Thanks, > Lijo > > >> > >> Thanks, > >> Lijo > >> > >>>> This patch is Reviewed-by: James Zhu <James.Zhu@xxxxxxx> > >>> > >>>> Thanks, > >>>> Lijo > >>>> > >>>>>> continue; > >>>>>> aqua_vanjaram_xcp_gpu_sched_update(adev, ring, > >>>>>> ring->xcp_id);