On 12/1/2022 12:52 PM, Liang, Prike wrote:
[Public]
-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
Sent: Thursday, December 1, 2022 2:39 PM
To: Liang, Prike <Prike.Liang@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Limonciello, Mario <Mario.Limonciello@xxxxxxx>; stable@xxxxxxxxxxxxxxx
Subject: Re: [PATCH] drm/amdgpu/sdma_v4_0: turn off SDMA ring buffer in the s2idle suspend
On 12/1/2022 11:52 AM, Prike Liang wrote:
In the SDMA s0ix save process requires to turn off SDMA ring buffer
for avoiding the SDMA in-flight request, otherwise will suffer from
SDMA page fault which causes by page request from in-flight SDMA ring
accessing at SDMA restore phase.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2248
Cc: stable@xxxxxxxxxxxxxxx # 6.0
Fixes: f8f4e2a51834 ("drm/amdgpu: skipping SDMA hw_init and hw_fini
for S0ix.")
Signed-off-by: Prike Liang <Prike.Liang@xxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 1122bd4eae98..2b9fe9f00343 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -913,7 +913,7 @@ static void sdma_v4_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 se
*
* Stop the gfx async dma ring buffers (VEGA10).
*/
-static void sdma_v4_0_gfx_stop(struct amdgpu_device *adev)
+static void sdma_v4_0_gfx_stop(struct amdgpu_device *adev, bool stop)
Better to rename as sdma_v4_0_gfx_enable(struct amdgpu_device *adev, bool enable).
Thanks,
Lijo
Ah, before this version I do use the sdma_v4_0_gfx_enable() name in the primary draft, but choose the sdma_v4_0_gfx_stop() for re-using the function name and comment info at the patch clean up.
AFAICS, use the _enable() name may can match well with the function job which does the SDMA ring enable bit setting, any other reason require to change the name here?
Right, enable() matches better and it's easier to read the code in
resume function with enable(true), rather than stop(false) :)
Thanks,
Lijo
{
u32 rb_cntl, ib_cntl;
int i;
@@ -922,10 +922,10 @@ static void sdma_v4_0_gfx_stop(struct
amdgpu_device *adev)
for (i = 0; i < adev->sdma.num_instances; i++) {
rb_cntl = RREG32_SDMA(i, mmSDMA0_GFX_RB_CNTL);
- rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0);
+ rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, stop
+? 0 : 1);
WREG32_SDMA(i, mmSDMA0_GFX_RB_CNTL, rb_cntl);
ib_cntl = RREG32_SDMA(i, mmSDMA0_GFX_IB_CNTL);
- ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_GFX_IB_CNTL, IB_ENABLE, 0);
+ ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_GFX_IB_CNTL, IB_ENABLE, stop
+? 0 : 1);
WREG32_SDMA(i, mmSDMA0_GFX_IB_CNTL, ib_cntl);
}
}
@@ -1044,7 +1044,7 @@ static void sdma_v4_0_enable(struct amdgpu_device *adev, bool enable)
int i;
if (!enable) {
- sdma_v4_0_gfx_stop(adev);
+ sdma_v4_0_gfx_stop(adev, true);
sdma_v4_0_rlc_stop(adev);
if (adev->sdma.has_page_queue)
sdma_v4_0_page_stop(adev);
@@ -1960,8 +1960,10 @@ static int sdma_v4_0_suspend(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
/* SMU saves SDMA state for us */
- if (adev->in_s0ix)
+ if (adev->in_s0ix) {
+ sdma_v4_0_gfx_stop(adev, true);
return 0;
+ }
return sdma_v4_0_hw_fini(adev);
}
@@ -1971,8 +1973,12 @@ static int sdma_v4_0_resume(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
/* SMU restores SDMA state for us */
- if (adev->in_s0ix)
+ if (adev->in_s0ix) {
+ sdma_v4_0_enable(adev, true);
+ sdma_v4_0_gfx_stop(adev, false);
+ amdgpu_ttm_set_buffer_funcs_status(adev, true);
return 0;
+ }
return sdma_v4_0_hw_init(adev);
}