Re: [PATCH] drm/amdgpu: Update ring scheduler info as needed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 7/19/2023 10:12 AM, Chen, Guchun wrote:
[Public]

-----Original Message-----
From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Lazar,
Lijo
Sent: Wednesday, July 19, 2023 12:13 AM
To: Zhu, James <James.Zhu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Zhu, James
<James.Zhu@xxxxxxx>; Kamal, Asad <Asad.Kamal@xxxxxxx>; Zhang,
Hawking <Hawking.Zhang@xxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: Update ring scheduler info as needed



On 7/18/2023 9:39 PM, James Zhu wrote:

On 2023-07-18 11:54, Lazar, Lijo wrote:


On 7/18/2023 8:39 PM, James Zhu wrote:

On 2023-07-18 10:19, Lazar, Lijo wrote:


On 7/18/2023 7:30 PM, James Zhu wrote:

On 2023-07-18 08:21, Lijo Lazar wrote:
Not all rings have scheduler associated. Only update scheduler
data for rings with scheduler. It could result in out of bound
access as total rings are more than those associated with
particular IPs.

Signed-off-by: Lijo Lazar <lijo.lazar@xxxxxxx>
---
   drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
b/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
index 72b629a78c62..d0fc62784e82 100644
--- a/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
+++ b/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
@@ -134,7 +134,7 @@ static int
aqua_vanjaram_xcp_sched_list_update(
       for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
           ring = adev->rings[i];
-        if (!ring || !ring->sched.ready)
+        if (!ring || !ring->sched.ready || ring->no_scheduler)
[JZ] any case for ring->no_scheduler = true, but ring->sched.ready
= true?

amdgpu_gfx_kiq_init_ring sets no_scheduler flag true for kiq rings.
amdgpu_device_init_schedulers inits schedulers only for rings which
doesn't have no_scheduler. However in gfx_v9_4_3_xcc_kiq_resume,
the ring is marked with ring->sched.ready = true;

I'm not sure why it's done this way, but this is the case even for
gfxv9.

Driver so far still has some concept abuse on sched.ready. In amdgpu_device_init_schedulers , it's set to be true once setting up sw scheduler for the ring requirement, however, after driver is up, this flag works like a hint to tell the corresponding ring is ready for HW submission after ring test succeeds.

For this case, it's probably caused by amdgpu_gfx_enable_kcq  calling amdgpu_ring_test_helper, which sets sched.ready unconditionally based on ring test result. This will lead value mismatch between ring->no_scheduler and ring->sched.ready. If my understanding is correct, I think adding a check in this helper function which only sets sched.ready when no_scheduler is false will help.  So I will provide a patch soon to prevent this mismatch in future.

+if (!ring->no_scheduler)
+       ring->sched.ready != r;

The ring.ready(ring->sched.ready) flag is used in gmcv9 code as well.This will need more rework.

Thanks,
Lijo

Regards,
Guchun

[JZ] I think the sched.ready confuses people. here just means ring
is ready be used.

Thanks for the clarification. One question is then do we need to
check ring ready status for assigning xcp ids or just check if the
ring is associated with a scheduler?

Keep only this - if (!ring || ring->no_scheduler) to assign xcp ids
and leave the ring ready status to the logic which really accepts
jobs on the ring?
[JZ] I think keeping ring->sched.ready will reduce schedule list which
will save a little scheduling time.

Fine, will just push this one.

Thanks,
Lijo


Thanks,
Lijo

This patch is Reviewed-by: James Zhu <James.Zhu@xxxxxxx>

Thanks,
Lijo

               continue;
           aqua_vanjaram_xcp_gpu_sched_update(adev, ring,
ring->xcp_id);



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux