In general I suggest to unify SRIOV and bare metal handling as much as
possible.
Regards,
Christian.
Am 27.04.22 um 08:03 schrieb Haohui Mai:
Great, thanks! I'll work on a patch then.
~Haohui
On Wed, Apr 27, 2022 at 1:57 PM Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
Am 27.04.22 um 03:53 schrieb Haohui Mai:
Hi,
I'm looking at the initialization sequences in sdma_v5_2.c. I'm
confused on whether the DMA engine should be activated when updating
the MMIO registers. Some clarifications are highly appreciated.
Here is the background:
* sdma_v5_2_enable() toggles the HALT bit to enable / disable the
async DMA engine
* sdma_v5_2_resume() initializes MMIO registers (e.g., queue
addresses) of the DMA engine.
* sdma_v5_2_start() is called when the kernel initializes the device.
However, the driver has two paths when updating the MMIO registers,
where the DMA engine is activated / deactivated respectively.
When amdgpu_sriov_vf(adev) is true:
866 if (amdgpu_sriov_vf(adev)) {
867 sdma_v5_2_ctx_switch_enable(adev, false);
868 sdma_v5_2_enable(adev, false);
869
870 /* set RB registers */
871 r = sdma_v5_2_gfx_resume(adev);
872 return r;
873 }
When amdgpu_sriov_vf(adev) is false:
893 sdma_v5_2_enable(adev, true);
894 /* enable sdma ring preemption */
895 sdma_v5_2_ctx_switch_enable(adev, true);
896
897 /* start the gfx rings and rlc compute queues */
898 r = sdma_v5_2_gfx_resume(adev);
Furthermore, sdma_v5_2_gfx_resume() re-enables the already active DMA
engine when amdgpu_sriov_vf(adev) is false:
728 /* unhalt engine */
729 temp =
RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL));
730 temp = REG_SET_FIELD(temp,
SDMA0_F32_CNTL, HALT, 0);
731 WREG32(sdma_v5_2_get_reg_offset(adev,
i, mmSDMA0_F32_CNTL), temp);
The behavior seems inconsistent. Looking at the code that re-enables
the engine, it seems that the driver assumes a deactivated DMA engine
during initialization regardless whether the device is in vf mode or
not.
Just wondering, is the behavior expected or is it a bug?
Off hand that sounds like a bug to me. The SRIOV code paths are in
general not that well tested since most testing/bringup happens on bare
metal.
Regards,
Christian.
Thanks,
Haohui