Re: [PATCH] drm/amdgpu: Add GDS clearing workaround in later init for gfx9

"Shamis, Leonid" <Leonid.Shamis@xxxxxxx> · Fri, 7 Jun 2019 20:16:26 +0000

James,

Do you set GDS_VMID0_BASE to 0?.. I don't see it in your patch.

Regards,
Leonid

On 2019-06-07, 15:42, "Zhu, James" <James.Zhu@xxxxxxx> wrote:

    On 2019-06-07 3:12 p.m., Zhu, James wrote:
    > On 2019-06-07 2:16 p.m., Alex Deucher wrote:
    >> On Fri, Jun 7, 2019 at 12:38 PM Zhu, James <James.Zhu@xxxxxxx> wrote:
    >>> Since Hardware bug, GDS exist ECC error after cold boot up,
    >>> adding GDS clearing workaround in later init for gfx9.
    >>>
    >>> Signed-off-by: James Zhu <James.Zhu@xxxxxxx>
    >>> ---
    >>>    drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 48 +++++++++++++++++++++++++++++++++++
    >>>    1 file changed, 48 insertions(+)
    >>>
    >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
    >>> index 76722fc..81f6ba8 100644
    >>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
    >>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
    >>> @@ -3634,6 +3634,50 @@ static const struct soc15_reg_entry sec_ded_counter_registers[] = {
    >>>       { SOC15_REG_ENTRY(GC, 0, mmSQC_EDC_CNT3), 0, 4, 6},
    >>>    };
    >>>
    >>> +
    >>> +static int gfx_v9_0_do_edc_gds_workarounds(struct amdgpu_device *adev)
    >>> +{
    >>> +       struct amdgpu_ring *ring = &adev->gfx.compute_ring[0];
    >>> +       int r;
    >>> +
    >>> +       r = amdgpu_ring_alloc(ring, 17);
    >>> +       if (r) {
    >>> +               DRM_ERROR("amdgpu: GDS workarounds failed to lock ring %s (%d).\n",
    >>> +                       ring->name, r);
    >>> +               return r;
    >>> +       }
    >>> +
    >>> +       amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
    >>> +       amdgpu_ring_write(ring, WRITE_DATA_ENGINE_SEL(0) |
    >>> +                          WRITE_DATA_DST_SEL(0));
    >>> +       amdgpu_ring_write(ring, SOC15_REG_OFFSET(GC, 0, mmGDS_VMID0_SIZE));
    >>> +       amdgpu_ring_write(ring, 0);
    >>> +       amdgpu_ring_write(ring, 0x10000);
    >> hardcoded size, please use the size from the driver.
    >>
    >>> +
    >>> +       amdgpu_ring_write(ring, PACKET3(PACKET3_DMA_DATA, 5));
    >>> +       amdgpu_ring_write(ring, (PACKET3_DMA_DATA_CP_SYNC |
    >>> +                               PACKET3_DMA_DATA_DST_SEL(1) |
    >>> +                               PACKET3_DMA_DATA_SRC_SEL(2) |
    >>> +                               PACKET3_DMA_DATA_ENGINE(0)));
    >>> +       amdgpu_ring_write(ring, 0);
    >>> +       amdgpu_ring_write(ring, 0);
    >>> +       amdgpu_ring_write(ring, 0);
    >>> +       amdgpu_ring_write(ring, 0);
    >>> +       amdgpu_ring_write(ring, PACKET3_DMA_DATA_CMD_RAW_WAIT | 0x10000);
    >> Instead of hardcoding the size, can you use the gds size from the
    >> driver (adev->gds.gds_size).
    > Hi Alex,
    >
    > Do you mean adev->gds.mem.total_size?
    >
    > But I see below operation in gfx_v9_0_ngg_init.
    >
    > adev->gds.mem.total_size -= ....
    >
    > Or you want me to add gds_size in struct amdgpu_gds?
    >
    > James

    Yeah, The amd-staging-drm-next branch has adev->gds.gds_size.

    James

    >
    >> With that fixed:
    >> Reviewed-by: Alex Deucher <alexander.deucher@xxxxxxx>
    >>
    >>> +
    >>> +       amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
    >>> +       amdgpu_ring_write(ring, WRITE_DATA_ENGINE_SEL(0) |
    >>> +                               WRITE_DATA_DST_SEL(0));
    >>> +       amdgpu_ring_write(ring, SOC15_REG_OFFSET(GC, 0, mmGDS_VMID0_SIZE));
    >>> +       amdgpu_ring_write(ring, 0);
    >>> +       amdgpu_ring_write(ring, 0x0);
    >>> +
    >>> +       amdgpu_ring_commit(ring);
    >>> +
    >>> +       return 0;
    >>> +}
    >>> +
    >>> +
    >>>    static int gfx_v9_0_do_edc_gpr_workarounds(struct amdgpu_device *adev)
    >>>    {
    >>>           struct amdgpu_ring *ring = &adev->gfx.compute_ring[0];
    >>> @@ -3810,6 +3854,10 @@ static int gfx_v9_0_ecc_late_init(void *handle)
    >>>                   return 0;
    >>>           }
    >>>
    >>> +       r = gfx_v9_0_do_edc_gds_workarounds(adev);
    >>> +       if (r)
    >>> +               return r;
    >>> +
    >>>           /* requires IBs so do in late init after IB pool is initialized */
    >>>           r = gfx_v9_0_do_edc_gpr_workarounds(adev);
    >>>           if (r)
    >>> --
    >>> 2.7.4
    >>>
    >>> _______________________________________________
    >>> amd-gfx mailing list
    >>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
    >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx