Re: [PATCH v2] drm/amdgpu: add workarounds for Raven VCN TMZ issue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2022-03-08 04:16, Christian König wrote:
Am 08.03.22 um 09:06 schrieb Lang Yu:
On 03/08/ , Christian König wrote:
Am 08.03.22 um 08:33 schrieb Lang Yu:
On 03/08/ , Christian König wrote:
Am 08.03.22 um 04:39 schrieb Lang Yu:
It is a hardware issue that VCN can't handle a GTT
backing stored TMZ buffer on Raven.

Move such a TMZ buffer to VRAM domain before command
submission.

v2:
    - Use patch_cs_in_place callback.

Suggested-by: Christian König <christian.koenig@xxxxxxx>
Signed-off-by: Lang Yu <Lang.Yu@xxxxxxx>
---
    drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 68 +++++++++++++++++++++++++++
    1 file changed, 68 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index 7bbb9ba6b80b..810932abd3af 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -24,6 +24,7 @@
    #include <linux/firmware.h>
    #include "amdgpu.h"
+#include "amdgpu_cs.h"
    #include "amdgpu_vcn.h"
    #include "amdgpu_pm.h"
    #include "soc15.h"
@@ -1905,6 +1906,72 @@ static const struct amd_ip_funcs vcn_v1_0_ip_funcs = {
        .set_powergating_state = vcn_v1_0_set_powergating_state,
    };
+/**
+ * It is a hardware issue that Raven VCN can't handle a GTT TMZ buffer. + * Move such a GTT TMZ buffer to VRAM domain before command submission.
+ */
+static int vcn_v1_0_validate_bo(struct amdgpu_cs_parser *parser,
+                struct amdgpu_job *job,
+                uint64_t addr)
+{
+    struct ttm_operation_ctx ctx = { false, false };
+    struct amdgpu_fpriv *fpriv = parser->filp->driver_priv;
+    struct amdgpu_vm *vm = &fpriv->vm;
+    struct amdgpu_bo_va_mapping *mapping;
+    struct amdgpu_bo *bo;
+    int r;
+
+    addr &= AMDGPU_GMC_HOLE_MASK;
+    if (addr & 0x7) {
+        DRM_ERROR("VCN messages must be 8 byte aligned!\n");
+        return -EINVAL;
+    }
+
+    mapping = amdgpu_vm_bo_lookup_mapping(vm, addr/AMDGPU_GPU_PAGE_SIZE);
+    if (!mapping || !mapping->bo_va || !mapping->bo_va->base.bo)
+        return -EINVAL;
+
+    bo = mapping->bo_va->base.bo;
+    if (!(bo->flags & AMDGPU_GEM_CREATE_ENCRYPTED))
+        return 0;
+
+    amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_VRAM);
+    r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
+    if (r) {
+        DRM_ERROR("Failed validating the VCN message BO (%d)!\n", r);
+        return r;
+    }
Well, exactly that won't work.

The message structure isn't TMZ protected because otherwise the driver won't
be able to stitch it together.

What is TMZ protected are the surfaces the message structure is pointing to. So what you would need to do is to completely parse the structure and then
move on the relevant buffers into VRAM.

Leo or James, can you help with that?
  From my observations, when decoding secure contents, register
GPCOM_VCPU_DATA0 and GPCOM_VCPU_DATA1 are set to a TMZ buffer address.
And this way works when allocating TMZ buffers in GTT domain.
As far as I remember that's only the case for the decoding, encoding works
by putting the addresses into the message buffer.

But could be that decoding is sufficient, Leo and James need to comment on
this.
It seems that only decode needs TMZ buffers. Only observe si_vid_create_tmz_buffer() was called in rvcn_dec_message_decode() in src/gallium/drivers/radeon/radeon_vcn_dec.c.

Mhm, good point. Let's wait for Leo and James to wake up, when we don't need encode support than that would makes things much easier.

For secure playback, the buffer required in TMZ are dpb, dt and ctx, for the rest esp. those for CPU access don't need that E.g. msg buffer, and bitstream buffer.

From radeon_vcn_dec.c, you can see the buffer for dpb and ctx, and dt buffer frontend/va/surface is set to PIPE_BIND_PROTECTED.


Regards,

Leo




Regards,
Christian.


Regards,
Lang

Regards,
Christian.

Regards,
Lang

Regards,
Christian.

+
+    return r;
+}
+
+static int vcn_v1_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p,
+                       struct amdgpu_job *job,
+                       struct amdgpu_ib *ib)
+{
+    uint32_t msg_lo = 0, msg_hi = 0;
+    int i, r;
+
+    for (i = 0; i < ib->length_dw; i += 2) {
+        uint32_t reg = amdgpu_ib_get_value(ib, i);
+        uint32_t val = amdgpu_ib_get_value(ib, i + 1);
+
+        if (reg == PACKET0(p->adev->vcn.internal.data0, 0)) {
+            msg_lo = val;
+        } else if (reg == PACKET0(p->adev->vcn.internal.data1, 0)) {
+            msg_hi = val;
+        } else if (reg == PACKET0(p->adev->vcn.internal.cmd, 0)) {
+            r = vcn_v1_0_validate_bo(p, job,
+                         ((u64)msg_hi) << 32 | msg_lo);
+            if (r)
+                return r;
+        }
+    }
+
+    return 0;
+}
+
+
    static const struct amdgpu_ring_funcs vcn_v1_0_dec_ring_vm_funcs = {
        .type = AMDGPU_RING_TYPE_VCN_DEC,
        .align_mask = 0xf,
@@ -1914,6 +1981,7 @@ static const struct amdgpu_ring_funcs vcn_v1_0_dec_ring_vm_funcs = {
        .get_rptr = vcn_v1_0_dec_ring_get_rptr,
        .get_wptr = vcn_v1_0_dec_ring_get_wptr,
        .set_wptr = vcn_v1_0_dec_ring_set_wptr,
+    .patch_cs_in_place = vcn_v1_0_ring_patch_cs_in_place,
        .emit_frame_size =
            6 + 6 + /* hdp invalidate / flush */
            SOC15_FLUSH_GPU_TLB_NUM_WREG * 6 +




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux