On 6/17/2024 8:58 AM, Chang, HaiJun wrote: > [AMD Official Use Only - AMD Internal Distribution Only] > > > Hi Lijo, > > > > Right, 18bits are byte aligned range of local XCC register, 16bites are > dword aligned offset range > > > > We find the normalization needs to be applied to many functions, like > > * KIQ: amdgpu_kiq_r/wreg/ > * RLC: amdgpu_virt_rlcg_reg_rw > * KIQ: amdgpu_gmc_fw_reg_write_reg_wait > * KIQ: > amdgpu_ring_emit_reg_write_reg_wait/amdgpu_ring_emit_reg_wait/amdgpu_ring_emit_wreg > > > > For sriov gfx register access, it only has 2 ways: rlc or kiq. Both of > the ways can use local xcc offset, so we think it’s simpler change to > init the gfx register offsets with local xcc offset only. > Ok, is this the only place? What about other calls in gfx_v9_4_3 like WREG32_SOC15_RLC/WREG32_SOC15 etc.? Thanks, Lijo > > > Thanks, > > HaiJun > > > > *From:*Lazar, Lijo <Lijo.Lazar@xxxxxxx> > *Sent:* Saturday, June 15, 2024 10:09 AM > *To:* Jian, Jane <Jane.Jian@xxxxxxx>; Chang, HaiJun > <HaiJun.Chang@xxxxxxx>; Zhao, Victor <Victor.Zhao@xxxxxxx> > *Cc:* amd-gfx@xxxxxxxxxxxxxxxxxxxxx > *Subject:* Re: [PATCH] drm/amdgpu: keep init xcc0 for all xccs under sriov > > > > [AMD Official Use Only - AMD Internal Distribution Only] > > > > Never mind, bit 16 and above is probably because of dword aligned offset. > > > > Any reason not to do this in kiq/rlc based writes to normalise all? > > > > Thanks, > > Lijo > > ------------------------------------------------------------------------ > > *From:*Lazar, Lijo > *Sent:* Friday, June 14, 2024 5:20:30 PM > *To:* Jian, Jane <Jane.Jian@xxxxxxx <mailto:Jane.Jian@xxxxxxx>>; Chang, > HaiJun <HaiJun.Chang@xxxxxxx <mailto:HaiJun.Chang@xxxxxxx>>; Zhao, > Victor <Victor.Zhao@xxxxxxx <mailto:Victor.Zhao@xxxxxxx>> > *Cc:* amd-gfx@xxxxxxxxxxxxxxxxxxxxx > <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx> <amd-gfx@xxxxxxxxxxxxxxxxxxxxx > <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>> > *Subject:* Re: [PATCH] drm/amdgpu: keep init xcc0 for all xccs under sriov > > > > > > On 6/14/2024 4:40 PM, Jane Jian wrote: >> [WHY] >> sriov has the higher bit violation when flushing tlb >> >> [HOW] >> for sriov only init XCC0(lower 16-bit) for all XCCs to avoid higher bit violation >> since kiq ring is always local, local address without XCC ID is enough to be sent to the XCC KIQ >> > > The description is incorrect. > > Bits 18:20 represent xcc id. To guarantee all paths pass a local > address, you should just strip bits 18:20 in kiq/rlcg read/write > functions rather than here. > > Thanks, > Lijo > >> Signed-off-by: Jane Jian <Jane.Jian@xxxxxxx <mailto:Jane.Jian@xxxxxxx>> >> --- >> drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c | 23 +++++++++++++++-------- >> 1 file changed, 15 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c >> index e14acab5cceb..4e38a66a52f4 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c >> +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c >> @@ -537,29 +537,36 @@ static void gfxhub_v1_2_xcc_init(struct amdgpu_device *adev, uint32_t xcc_mask) >> { >> struct amdgpu_vmhub *hub; >> int i; >> + uint32_t gc_index; >> >> for_each_inst(i, xcc_mask) { >> hub = &adev->vmhub[AMDGPU_GFXHUB(i)]; >> >> + /* for sriov only init XCC0(lower 16-bit) to avoid higher bit violation */ >> + if (amdgpu_sriov_vf(adev)) >> + gc_index = 0; >> + else >> + gc_index = GET_INST(GC, i); >> + >> hub->ctx0_ptb_addr_lo32 = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), >> + SOC15_REG_OFFSET(GC, gc_index, >> regVM_CONTEXT0_PAGE_TABLE_BASE_ADDR_LO32); >> hub->ctx0_ptb_addr_hi32 = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), >> + SOC15_REG_OFFSET(GC, gc_index, >> regVM_CONTEXT0_PAGE_TABLE_BASE_ADDR_HI32); >> hub->vm_inv_eng0_sem = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), regVM_INVALIDATE_ENG0_SEM); >> + SOC15_REG_OFFSET(GC, gc_index, regVM_INVALIDATE_ENG0_SEM); >> hub->vm_inv_eng0_req = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), regVM_INVALIDATE_ENG0_REQ); >> + SOC15_REG_OFFSET(GC, gc_index, regVM_INVALIDATE_ENG0_REQ); >> hub->vm_inv_eng0_ack = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), regVM_INVALIDATE_ENG0_ACK); >> + SOC15_REG_OFFSET(GC, gc_index, regVM_INVALIDATE_ENG0_ACK); >> hub->vm_context0_cntl = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), regVM_CONTEXT0_CNTL); >> + SOC15_REG_OFFSET(GC, gc_index, regVM_CONTEXT0_CNTL); >> hub->vm_l2_pro_fault_status = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), >> + SOC15_REG_OFFSET(GC, gc_index, >> regVM_L2_PROTECTION_FAULT_STATUS); >> hub->vm_l2_pro_fault_cntl = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), regVM_L2_PROTECTION_FAULT_CNTL); >> + SOC15_REG_OFFSET(GC, gc_index, regVM_L2_PROTECTION_FAULT_CNTL); >> >> hub->ctx_distance = regVM_CONTEXT1_CNTL - >> regVM_CONTEXT0_CNTL; >