Hi Christian, Thanks again for your review. And for the mask change my understanding is it is to be used for mark different part of fw (1<<24 is for stack and 2<<24 is for the data). And more detailed background would need Leo give us. Best Regards, Frank -----Original Message----- From: Koenig, Christian Sent: Wednesday, November 22, 2017 3:57 PM To: Min, Frank <Frank.Min at amd.com>; amd-gfx at lists.freedesktop.org; Liu, Leo <Leo.Liu at amd.com> Subject: Re: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2) Hi Frank, thanks, the patch looks much better now. > The masks using is to programming the stack and data part for vce fw. And this part of code is borrowed from the non-sriov sequences. In this case Leo can you explain this strange masks used for the VCE_VCPU_CACHE_OFFSET* registers? > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET0), > - offset & 0x7FFFFFFF); > + offset & ~0x0f000000); ... > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET1), > - offset & 0x7FFFFFFF); > + (offset & ~0x0f000000) | (1 << 24)); Using ~0x0f000000 looks really odd here and what should the "| (1 << 24)" part be about? Thanks, Christian. Am 22.11.2017 um 06:11 schrieb Min, Frank: > Hi Christian, > Patch updated according to your suggestions. > The masks using is to programming the stack and data part for vce fw. And this part of code is borrowed from the non-sriov sequences. > > Best Regards, > Frank > > 1. program vce 4.0 fw with 48 bit address 2. correct vce 4.0 fw stack > and date offset > > Change-Id: Ic1bc49c21d3a90c477d11162f9d6d9e2073fbbd3 > Signed-off-by: Frank Min <Frank.Min at amd.com> > --- > drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 38 +++++++++++++++++++++++------------ > 1 file changed, 25 insertions(+), 13 deletions(-) mode change > 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c > > diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c > b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c > old mode 100644 > new mode 100755 > index 7574554..024a1be > --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c > @@ -243,37 +243,49 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev) > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_LMI_VM_CTRL), 0); > > if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) { > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), > - adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), > - adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), > adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + mmVCE_LMI_VCPU_CACHE_64BIT_BAR0), > + (adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & > +0xff); > } else { > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), > adev->vce.gpu_addr >> 8); > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + mmVCE_LMI_VCPU_CACHE_64BIT_BAR0), > + (adev->vce.gpu_addr >> 40) & 0xff); > + } > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), > adev->vce.gpu_addr >> 8); > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + mmVCE_LMI_VCPU_CACHE_64BIT_BAR1), > + (adev->vce.gpu_addr >> 40) & 0xff); > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), > adev->vce.gpu_addr >> 8); > - } > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + mmVCE_LMI_VCPU_CACHE_64BIT_BAR2), > + (adev->vce.gpu_addr >> 40) & 0xff); > > offset = AMDGPU_VCE_FIRMWARE_OFFSET; > size = VCE_V4_0_FW_SIZE; > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET0), > - offset & 0x7FFFFFFF); > + offset & ~0x0f000000); > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_VCPU_CACHE_SIZE0), size); > > - offset += size; > + offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? offset > ++ size : 0; > size = VCE_V4_0_STACK_SIZE; > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET1), > - offset & 0x7FFFFFFF); > + (offset & ~0x0f000000) | (1 << 24)); > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_VCPU_CACHE_SIZE1), size); > > offset += size; > size = VCE_V4_0_DATA_SIZE; > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET2), > - offset & 0x7FFFFFFF); > + (offset & ~0x0f000000) | (2 << 24)); > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_VCPU_CACHE_SIZE2), size); > > MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_LMI_CTRL2), ~0x100, 0); > -- > 1.9.1 > > -----é?®ä»¶å??件----- > å??件人: Christian König [mailto:ckoenig.leichtzumerken at gmail.com] > å??é??æ?¶é?´: 2017å¹´11æ??21æ?¥ 22:45 > æ?¶ä»¶äºº: Min, Frank <Frank.Min at amd.com>; amd-gfx at lists.freedesktop.org > 主é¢?: Re: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2) > > Am 21.11.2017 um 11:23 schrieb Frank Min: >> 1. program vce 4.0 fw with 48 bit address 2. correct vce 4.0 fw stack >> and date offset >> >> Change-Id: I835f3f52f3b29f996812a3948aabede9f2d9b056 >> Signed-off-by: Frank Min <Frank.Min at amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 97 ++++++++++++++++++++++------------- >> 1 file changed, 62 insertions(+), 35 deletions(-) >> mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c >> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c >> old mode 100644 >> new mode 100755 >> index 7574554..dc7b615 >> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c >> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c >> @@ -243,59 +243,86 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev) >> MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_LMI_VM_CTRL), 0); >> >> if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) { >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), >> - adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), >> - adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), >> adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_LMI_VCPU_CACHE_64BIT_BAR0), >> + (adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & >> +0xff); >> } else { >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), >> adev->vce.gpu_addr >> 8); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_LMI_VCPU_CACHE_64BIT_BAR0), >> + (adev->vce.gpu_addr >> 40) & 0xff); >> + } >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), >> adev->vce.gpu_addr >> 8); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_LMI_VCPU_CACHE_64BIT_BAR1), >> + (adev->vce.gpu_addr >> 40) & 0xff); >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), >> adev->vce.gpu_addr >> 8); >> - } >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_LMI_VCPU_CACHE_64BIT_BAR2), >> + (adev->vce.gpu_addr >> 40) & 0xff); >> >> offset = AMDGPU_VCE_FIRMWARE_OFFSET; >> size = VCE_V4_0_FW_SIZE; >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET0), >> - offset & 0x7FFFFFFF); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_SIZE0), size); >> - >> - offset += size; >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_OFFSET0), >> + offset & ~0x0f000000); >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_SIZE0), size); >> + >> + offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? >> + offset + size : 0; >> size = VCE_V4_0_STACK_SIZE; >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET1), >> - offset & 0x7FFFFFFF); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_SIZE1), size); >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_OFFSET1), >> + (offset & ~0x0f000000) | (1 << 24)); > That mask still looks incorrect to me. > >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_SIZE1), size); >> >> offset += size; >> size = VCE_V4_0_DATA_SIZE; >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET2), >> - offset & 0x7FFFFFFF); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_SIZE2), size); >> - >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_CTRL2), ~0x100, 0); >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_SYS_INT_EN), >> - VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK, >> - VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK); >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_OFFSET2), >> + (offset & ~0x0f000000) | (2 << 24)); > Dito. > >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_SIZE2), size); >> + >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_LMI_CTRL2), ~0x100, 0); >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_SYS_INT_EN), >> + VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK, >> + VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK); >> >> /* end of MC_RESUME */ >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_STATUS), >> - VCE_STATUS__JOB_BUSY_MASK, ~VCE_STATUS__JOB_BUSY_MASK); >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CNTL), >> - ~0x200001, VCE_VCPU_CNTL__CLK_EN_MASK); >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_SOFT_RESET), >> - ~VCE_SOFT_RESET__ECPU_SOFT_RESET_MASK, 0); >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_STATUS), >> + VCE_STATUS__JOB_BUSY_MASK, >> + ~VCE_STATUS__JOB_BUSY_MASK); >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CNTL), >> + ~0x200001, >> + VCE_VCPU_CNTL__CLK_EN_MASK); >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_SOFT_RESET), >> + ~VCE_SOFT_RESET__ECPU_SOFT_RESET_MASK, 0); > Unrelated coding style change, please concentrate on the functional change for this patch. > >> >> MMSCH_V1_0_INSERT_DIRECT_POLL(SOC15_REG_OFFSET(VCE, 0, mmVCE_STATUS), >> - VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK, >> - VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK); >> + VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK, >> + VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK); > Here the indentation is wrong. Looks like it was correct before the change. > > Regards, > Christian. > >> >> /* clear BUSY flag */ >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_STATUS), >> - ~VCE_STATUS__JOB_BUSY_MASK, 0); >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_STATUS), >> + ~VCE_STATUS__JOB_BUSY_MASK, 0); >> >> /* add end packet */ >> memcpy((void *)init_table, &end, sizeof(struct >> mmsch_v1_0_cmd_end)); >