It's correct and already working on vega10/tonga for days, In fact the guilty context already works at my side BR Monk -----Original Message----- From: Christian König [mailto:deathsimple@xxxxxxxxxxx] Sent: Wednesday, May 03, 2017 5:02 PM To: Liu, Monk <Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org Subject: Re: [PATCH 3/6] drm/amdgpu:re-write sriov_reinit_early/late Am 03.05.2017 um 05:48 schrieb Monk Liu: > 1,this way we make those routines compatible with the sequence > requirment for both Tonga and Vega10 2,ignore PSP hw init when > doing TDR, because for SR-IOV device the ucode won't get lost after VF > FLR, so no need to invoke PSP doing the ucode reloading again. > > Signed-off-by: Monk Liu <Monk.Liu at amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 63 ++++++++++++++++++------------ > 1 file changed, 39 insertions(+), 24 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 5161c20..5573792 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -1718,19 +1718,27 @@ static int amdgpu_sriov_reinit_early(struct amdgpu_device *adev) > { > int i, r; > > - for (i = 0; i < adev->num_ip_blocks; i++) { > - if (!adev->ip_blocks[i].status.valid) > - continue; > - > - if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_COMMON || > - adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC || > - adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_IH) > - r = adev->ip_blocks[i].version->funcs->hw_init(adev); > + static enum amd_ip_block_type ip_order[] = { > + AMD_IP_BLOCK_TYPE_GMC, > + AMD_IP_BLOCK_TYPE_COMMON, > + AMD_IP_BLOCK_TYPE_GFXHUB, > + AMD_IP_BLOCK_TYPE_MMHUB, > + AMD_IP_BLOCK_TYPE_IH, > + }; > + > + for (i = 0; i < sizeof(ip_order)/sizeof(ip_order[0]); i++) { You should use ARRAY_SIZE here instead. > + int j; > + struct amdgpu_ip_block *block; > + > + for (j = 0; j < adev->num_ip_blocks; j++) { > + block = &adev->ip_blocks[j]; > + > + if (block->version->type != ip_order[i] || > + !block->status.valid) > + continue; > > - if (r) { > - DRM_ERROR("resume of IP block <%s> failed %d\n", > - adev->ip_blocks[i].version->funcs->name, r); > - return r; > + r = block->version->funcs->hw_init(adev); > + DRM_INFO("RE-INIT: %s %s\n", block->version->funcs->name, > +r?"failed":"successed"); > } > } > > @@ -1741,20 +1749,27 @@ static int amdgpu_sriov_reinit_late(struct amdgpu_device *adev) > { > int i, r; > > - for (i = 0; i < adev->num_ip_blocks; i++) { > - if (!adev->ip_blocks[i].status.valid) > - continue; > + static enum amd_ip_block_type ip_order[] = { > + AMD_IP_BLOCK_TYPE_SMC, > + AMD_IP_BLOCK_TYPE_DCE, > + AMD_IP_BLOCK_TYPE_GFX, > + AMD_IP_BLOCK_TYPE_SDMA, > + AMD_IP_BLOCK_TYPE_VCE, > + }; > > - if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_COMMON || > - adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC || > - adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_IH ) > - continue; > + for (i = 0; i < sizeof(ip_order)/sizeof(ip_order[0]); i++) { And here as well. > + int j; > + struct amdgpu_ip_block *block; > > - r = adev->ip_blocks[i].version->funcs->hw_init(adev); > - if (r) { > - DRM_ERROR("resume of IP block <%s> failed %d\n", > - adev->ip_blocks[i].version->funcs->name, r); > - return r; > + for (j = 0; j < adev->num_ip_blocks; j++) { > + block = &adev->ip_blocks[j]; > + > + if (block->version->type != ip_order[i] || > + !block->status.valid) > + continue; > + > + r = block->version->funcs->hw_init(adev); > + DRM_INFO("RE-INIT: %s %s\n", block->version->funcs->name, > +r?"failed":"successed"); This changes the order in which blocks are initialized which is probably not correct. Alex needs to take a look at this, but we clearly need to improve the handling here. Regards, Christian. > } > } >