On Wed, Oct 16, 2024 at 2:19 AM Min, Frank <Frank.Min@xxxxxxx> wrote: > > [AMD Official Use Only - AMD Internal Distribution Only] > > Hi Lijo, > Thanks a lot for your review and suggestions. > > Here are the updated patches. Patches look good, but I agree with Lijo that we should switch to a dedicated sdma7 header at some point in case there are other discrepancies, but I think that can be a later update. Some additional comments below. > > Best Regards, > Frank > > From: Frank Min <Frank.Min@xxxxxxx> > Date: Thu, 10 Oct 2024 16:41:32 +0800 > Subject: [PATCH 1/2] drm/amdgpu: fix random data corruption for sdma 7 > > There is random data corruption caused by const fill, this is caused by > write compression mode not correctly configured. > > So correct compression mode for const fill. > > Signed-off-by: Frank Min <Frank.Min@xxxxxxx> > --- > drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h | 6 ++++++ > drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 3 ++- > 2 files changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h > index d8cf830916b9..9f43c14f0bb0 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h > @@ -4346,6 +4346,12 @@ > #define SDMA_PKT_CONSTANT_FILL_HEADER_sw_shift 16 > #define SDMA_PKT_CONSTANT_FILL_HEADER_SW(x) (((x) & SDMA_PKT_CONSTANT_FILL_HEADER_sw_mask) << SDMA_PKT_CONSTANT_FILL_HEADER_sw_shift) > > +/*define for compression field for sdma7*/ > +#define SDMA_PKT_CONSTANT_FILL_HEADER_compress_offset 0 > +#define SDMA_PKT_CONSTANT_FILL_HEADER_compress_mask 0x00000001 > +#define SDMA_PKT_CONSTANT_FILL_HEADER_compress_shift 16 > +#define SDMA_PKT_CONSTANT_FILL_HEADER_COMPRESS(x) (((x) & SDMA_PKT_CONSTANT_FILL_HEADER_compress_mask) << SDMA_PKT_CONSTANT_FILL_HEADER_compress_shift) Might be better to put this in sdma_v7_0.c since it's not part of sdma6. With that fixed: Reviewed-by: Alex Deucher <alexander.deucher@xxxxxxx> > + > /*define for cache_policy field*/ > #define SDMA_PKT_CONSTANT_FILL_HEADER_cache_policy_offset 0 > #define SDMA_PKT_CONSTANT_FILL_HEADER_cache_policy_mask 0x00000007 > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c > index a8763496aed3..573fb6a56f8b 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c > @@ -1724,7 +1724,8 @@ static void sdma_v7_0_emit_fill_buffer(struct amdgpu_ib *ib, > uint64_t dst_offset, > uint32_t byte_count) > { > - ib->ptr[ib->length_dw++] = SDMA_PKT_COPY_LINEAR_HEADER_OP(SDMA_OP_CONST_FILL); > + ib->ptr[ib->length_dw++] = SDMA_PKT_CONSTANT_FILL_HEADER_OP(SDMA_OP_CONST_FILL) | > + SDMA_PKT_CONSTANT_FILL_HEADER_COMPRESS(1); > ib->ptr[ib->length_dw++] = lower_32_bits(dst_offset); > ib->ptr[ib->length_dw++] = upper_32_bits(dst_offset); > ib->ptr[ib->length_dw++] = src_data; > -- > 2.43.0 > > From: Frank Min <Frank.Min@xxxxxxx> > Date: Wed, 16 Oct 2024 14:06:01 +0800 > Subject: [PATCH 2/2] drm/amdgpu: fix typo for sdma6 constant fill packet > > Fix typo for sdma6 constant fill packet > > Signed-off-by: Frank Min <Frank.Min@xxxxxxx> Reviewed-by: Alex Deucher <alexander.deucher@xxxxxxx> > --- > drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c > index 208a1fa9d4e7..cb07327d294f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c > @@ -1726,7 +1726,7 @@ static void sdma_v6_0_emit_fill_buffer(struct amdgpu_ib *ib, > uint64_t dst_offset, > uint32_t byte_count) > { > - ib->ptr[ib->length_dw++] = SDMA_PKT_COPY_LINEAR_HEADER_OP(SDMA_OP_CONST_FILL); > + ib->ptr[ib->length_dw++] = SDMA_PKT_CONSTANT_FILL_HEADER_OP(SDMA_OP_CONST_FILL); > ib->ptr[ib->length_dw++] = lower_32_bits(dst_offset); > ib->ptr[ib->length_dw++] = upper_32_bits(dst_offset); > ib->ptr[ib->length_dw++] = src_data; > -- > 2.43.0 > > -----Original Message----- > From: Lazar, Lijo <Lijo.Lazar@xxxxxxx> > Sent: Wednesday, October 16, 2024 12:41 PM > To: Min, Frank <Frank.Min@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Zhang, Hawking <Hawking.Zhang@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; Gao, Likun <Likun.Gao@xxxxxxx>; Olsak, Marek <Marek.Olsak@xxxxxxx> > Subject: Re: [PATCH] drm/amdgpu: fix random data corruption for sdma 7 > > > > On 10/16/2024 8:29 AM, Min, Frank wrote: > > [AMD Official Use Only - AMD Internal Distribution Only] > > > > [AMD Official Use Only - AMD Internal Distribution Only] > > > > From: Frank Min <Frank.Min@xxxxxxx> > > > > There is random data corruption caused by const fill, this is caused by write compression mode not correclt configured. > > > > So correct compression mode for const fill. > > > > Signed-off-by: Frank Min <Frank.Min@xxxxxxx> > > --- > > drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h | 5 ++++- > > drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 3 ++- > > 2 files changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h > > b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h > > index d8cf830916b9..18e73057e6ba 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h > > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0_0_pkt_open.h > > @@ -116,11 +116,14 @@ > > #define SDMA_PKT_COPY_LINEAR_HEADER_sub_op_shift 8 #define > > SDMA_PKT_COPY_LINEAR_HEADER_SUB_OP(x) (((x) & > > SDMA_PKT_COPY_LINEAR_HEADER_sub_op_mask) << > > SDMA_PKT_COPY_LINEAR_HEADER_sub_op_shift) > > > > -/*define for encrypt field*/ > > +/*define for encrypt/compress field*/ > > #define SDMA_PKT_COPY_LINEAR_HEADER_encrypt_offset 0 > > #define SDMA_PKT_COPY_LINEAR_HEADER_encrypt_mask 0x00000001 > > #define SDMA_PKT_COPY_LINEAR_HEADER_encrypt_shift 16 > > +/*sdma6 use bit 16 for data encryption*/ > > #define SDMA_PKT_COPY_LINEAR_HEADER_ENCRYPT(x) (((x) & > > SDMA_PKT_COPY_LINEAR_HEADER_encrypt_mask) << > > SDMA_PKT_COPY_LINEAR_HEADER_encrypt_shift) > > +/*sdma7 use bit 16 for dcc compression*/ #define > > +SDMA_PKT_COPY_LINEAR_HEADER_COMPRESS(x) (((x) & > > +SDMA_PKT_COPY_LINEAR_HEADER_encrypt_mask) << > > +SDMA_PKT_COPY_LINEAR_HEADER_encrypt_shift) > > > > This doesn't look like the right way. The confusion is there because it uses SDMA_PKT_COPY_LINEAR_HEADER_OP instead of SDMA_PKT_CONSTANT_FILL_HEADER_OP. > > It's better to > > 1) Bring in sdma 7 packet header definition (Mixing with SDMA 6 is also not good) > > 2) Use SDMA_PKT_CONSTANT_FILL_HEADER_OP instead of SDMA_PKT_COPY_LINEAR_HEADER_OP > > 3) Use the correct bit mask/shift as defined in sdma 7 packet header definition for const_fill. > > Thanks, > Lijo > > > /*define for tmz field*/ > > #define SDMA_PKT_COPY_LINEAR_HEADER_tmz_offset 0 diff --git > > a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c > > b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c > > index a8763496aed3..9181579ae6a1 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c > > @@ -1724,7 +1724,8 @@ static void sdma_v7_0_emit_fill_buffer(struct amdgpu_ib *ib, > > uint64_t dst_offset, > > uint32_t byte_count) { > > - ib->ptr[ib->length_dw++] = SDMA_PKT_COPY_LINEAR_HEADER_OP(SDMA_OP_CONST_FILL); > > + ib->ptr[ib->length_dw++] = SDMA_PKT_COPY_LINEAR_HEADER_OP(SDMA_OP_CONST_FILL) | > > + SDMA_PKT_COPY_LINEAR_HEADER_COMPRESS(1); > > ib->ptr[ib->length_dw++] = lower_32_bits(dst_offset); > > ib->ptr[ib->length_dw++] = upper_32_bits(dst_offset); > > ib->ptr[ib->length_dw++] = src_data; > > -- > > 2.43.0 > >