[AMD Official Use Only - AMD Internal Distribution Only] Hi Lijo, Yes, you're right. The WREG32_SOC15_RLC/WREG32_SOC15 implementation are not correct. Actually even didn't select right xcc_id for KIQ access. In amdgpu_sriov_wreg/rreg function, we should add normalization handling for GC_HWIP access. It can cover both amdgpu_kiq_r/wreg/ and amdgpu_virt_rlcg_reg_rw cases. For amdgpu_gmc_fw_reg_write_reg_wait case, do you think should we add a GC/MMHUB flag as parameter and do normalization for GC in this function, or do normalization on caller function gmc_v9_0_flush_gpu_tlb. Thanks, HaiJun -----Original Message----- From: Lazar, Lijo <Lijo.Lazar@xxxxxxx> Sent: Monday, June 17, 2024 11:35 AM To: Chang, HaiJun <HaiJun.Chang@xxxxxxx>; Jian, Jane <Jane.Jian@xxxxxxx>; Zhao, Victor <Victor.Zhao@xxxxxxx> Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx Subject: Re: [PATCH] drm/amdgpu: keep init xcc0 for all xccs under sriov On 6/17/2024 8:58 AM, Chang, HaiJun wrote: > [AMD Official Use Only - AMD Internal Distribution Only] > > > Hi Lijo, > > > > Right, 18bits are byte aligned range of local XCC register, 16bites > are dword aligned offset range > > > > We find the normalization needs to be applied to many functions, like > > * KIQ: amdgpu_kiq_r/wreg/ > * RLC: amdgpu_virt_rlcg_reg_rw > * KIQ: amdgpu_gmc_fw_reg_write_reg_wait > * KIQ: > > amdgpu_ring_emit_reg_write_reg_wait/amdgpu_ring_emit_reg_wait/amdgpu_r > ing_emit_wreg > > > > For sriov gfx register access, it only has 2 ways: rlc or kiq. Both > of the ways can use local xcc offset, so we think it’s simpler change > to init the gfx register offsets with local xcc offset only. > Ok, is this the only place? What about other calls in gfx_v9_4_3 like WREG32_SOC15_RLC/WREG32_SOC15 etc.? Thanks, Lijo > > > Thanks, > > HaiJun > > > > *From:*Lazar, Lijo <Lijo.Lazar@xxxxxxx> > *Sent:* Saturday, June 15, 2024 10:09 AM > *To:* Jian, Jane <Jane.Jian@xxxxxxx>; Chang, HaiJun > <HaiJun.Chang@xxxxxxx>; Zhao, Victor <Victor.Zhao@xxxxxxx> > *Cc:* amd-gfx@xxxxxxxxxxxxxxxxxxxxx > *Subject:* Re: [PATCH] drm/amdgpu: keep init xcc0 for all xccs under > sriov > > > > [AMD Official Use Only - AMD Internal Distribution Only] > > > > Never mind, bit 16 and above is probably because of dword aligned offset. > > > > Any reason not to do this in kiq/rlc based writes to normalise all? > > > > Thanks, > > Lijo > > ---------------------------------------------------------------------- > -- > > *From:*Lazar, Lijo > *Sent:* Friday, June 14, 2024 5:20:30 PM > *To:* Jian, Jane <Jane.Jian@xxxxxxx <mailto:Jane.Jian@xxxxxxx>>; > Chang, HaiJun <HaiJun.Chang@xxxxxxx <mailto:HaiJun.Chang@xxxxxxx>>; > Zhao, Victor <Victor.Zhao@xxxxxxx <mailto:Victor.Zhao@xxxxxxx>> > *Cc:* amd-gfx@xxxxxxxxxxxxxxxxxxxxx > <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx> <amd-gfx@xxxxxxxxxxxxxxxxxxxxx > <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>> > *Subject:* Re: [PATCH] drm/amdgpu: keep init xcc0 for all xccs under > sriov > > > > > > On 6/14/2024 4:40 PM, Jane Jian wrote: >> [WHY] >> sriov has the higher bit violation when flushing tlb >> >> [HOW] >> for sriov only init XCC0(lower 16-bit) for all XCCs to avoid higher >> bit violation since kiq ring is always local, local address without >> XCC ID is enough to be sent to the XCC KIQ >> > > The description is incorrect. > > Bits 18:20 represent xcc id. To guarantee all paths pass a local > address, you should just strip bits 18:20 in kiq/rlcg read/write > functions rather than here. > > Thanks, > Lijo > >> Signed-off-by: Jane Jian <Jane.Jian@xxxxxxx >><mailto:Jane.Jian@xxxxxxx>> >> --- >> drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c | 23 >>+++++++++++++++-------- >> 1 file changed, 15 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c >> b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c >> index e14acab5cceb..4e38a66a52f4 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c >> +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c >> @@ -537,29 +537,36 @@ static void gfxhub_v1_2_xcc_init(struct >>amdgpu_device *adev, uint32_t xcc_mask) >> { >> struct amdgpu_vmhub *hub; >> int i; >> + uint32_t gc_index; >> >> for_each_inst(i, xcc_mask) { >> hub = &adev->vmhub[AMDGPU_GFXHUB(i)]; >> >> + /* for sriov only init XCC0(lower 16-bit) to avoid >> +higher bit violation */ >> + if (amdgpu_sriov_vf(adev)) >> + gc_index = 0; >> + else >> + gc_index = GET_INST(GC, i); >> + >> hub->ctx0_ptb_addr_lo32 = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), >> + SOC15_REG_OFFSET(GC, gc_index, >> >>regVM_CONTEXT0_PAGE_TABLE_BASE_ADDR_LO32); >> hub->ctx0_ptb_addr_hi32 = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), >> + SOC15_REG_OFFSET(GC, gc_index, >> >>regVM_CONTEXT0_PAGE_TABLE_BASE_ADDR_HI32); >> hub->vm_inv_eng0_sem = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), >>regVM_INVALIDATE_ENG0_SEM); >> + SOC15_REG_OFFSET(GC, gc_index, >> +regVM_INVALIDATE_ENG0_SEM); >> hub->vm_inv_eng0_req = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), >>regVM_INVALIDATE_ENG0_REQ); >> + SOC15_REG_OFFSET(GC, gc_index, >> +regVM_INVALIDATE_ENG0_REQ); >> hub->vm_inv_eng0_ack = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), >>regVM_INVALIDATE_ENG0_ACK); >> + SOC15_REG_OFFSET(GC, gc_index, >> +regVM_INVALIDATE_ENG0_ACK); >> hub->vm_context0_cntl = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), >>regVM_CONTEXT0_CNTL); >> + SOC15_REG_OFFSET(GC, gc_index, >> +regVM_CONTEXT0_CNTL); >> hub->vm_l2_pro_fault_status = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), >> + SOC15_REG_OFFSET(GC, gc_index, >> regVM_L2_PROTECTION_FAULT_STATUS); >> hub->vm_l2_pro_fault_cntl = >> - SOC15_REG_OFFSET(GC, GET_INST(GC, i), >>regVM_L2_PROTECTION_FAULT_CNTL); >> + SOC15_REG_OFFSET(GC, gc_index, >> +regVM_L2_PROTECTION_FAULT_CNTL); >> >> hub->ctx_distance = regVM_CONTEXT1_CNTL - >> regVM_CONTEXT0_CNTL; >