Thanks for the review. I changed the description as suggested and submitted the patch.

Shaoyun.liu

-----Original Message-----
From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
Sent: Tuesday, November 30, 2021 1:19 AM
To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Liu, Shaoyun <Shaoyun.Liu@xxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: adjust the kfd reset sequence in reset sriov function

On 2021-11-29 at 9:40 p.m., shaoyunl wrote:
> This change revert previous commit
> 7079e7d5c6bf: drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov
> cd547b93c62a: drm/amdgpu: move kfd post_reset out of reset_sriov
> function

It looks like this is not a straight revert. It moves the
amdgpu_amdkfd_pre_reset call to an earlier place in
amdgpu_device_reset_sriov, presumably to address the sequence issue that
the first patch was originally meant to fix. The patch description should
mention that.

With that fixed, the patch is

Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>

>
> Some register access(GRBM_GFX_CNTL) only be allowed on full access
> mode. Move kfd_pre_reset and kfd_post_reset back inside reset_sriov
> function.
>
> Signed-off-by: shaoyunl <shaoyun.liu@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 1989f9e9379e..3c5afa45173c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4285,6 +4285,8 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
>  {
>  	int r;
>
> +	amdgpu_amdkfd_pre_reset(adev);
> +
>  	if (from_hypervisor)
>  		r = amdgpu_virt_request_full_gpu(adev, true);
>  	else
> @@ -4312,6 +4314,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
>
>  	amdgpu_irq_gpu_reset_resume_helper(adev);
>  	r = amdgpu_ib_ring_tests(adev);
> +	amdgpu_amdkfd_post_reset(adev);
>
>  error:
>  	if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
> @@ -5026,7 +5029,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>
>  		cancel_delayed_work_sync(&tmp_adev->delayed_init_work);
>
> -		amdgpu_amdkfd_pre_reset(tmp_adev);
> +		if (!amdgpu_sriov_vf(tmp_adev))
> +			amdgpu_amdkfd_pre_reset(tmp_adev);
>
>  		/*
>  		 * Mark these ASICs to be reseted as untracked first
> @@ -5144,9 +5148,9 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>
>  skip_sched_resume:
>  	list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
> -		/* unlock kfd */
> -		if (!need_emergency_restart)
> -			amdgpu_amdkfd_post_reset(tmp_adev);
> +		/* unlock kfd: SRIOV would do it separately */
> +		if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev))
> +			amdgpu_amdkfd_post_reset(tmp_adev);
>
>  		/* kfd_post_reset will do nothing if kfd device is not initialized,
>  		 * need to bring up kfd here if it's not be initialized before
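
To make the ordering under discussion concrete, here is a toy C model (not kernel code) that records the order of calls on the SR-IOV and non-SR-IOV paths as the quoted hunks leave them. Everything prefixed toy_ is hypothetical. The trace tags only echo function names from the hunks, steps the hunks do not show are collapsed into one placeholder tag, and the dispatch between the SR-IOV reset and a bare-metal ASIC reset is an assumption about the surrounding code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: none of these are the kernel functions. Each step
 * appends a tag to a trace so the resulting call order can be inspected. */
#define TRACE_MAX 256

struct toy_dev {
	bool is_sriov_vf;      /* stands in for amdgpu_sriov_vf(adev) */
	char trace[TRACE_MAX]; /* space-separated record of calls */
};

static void trace_add(struct toy_dev *d, const char *tag)
{
	size_t used = strlen(d->trace);

	/* room for separator, tag and terminating NUL */
	assert(used + strlen(tag) + 2 <= TRACE_MAX);
	if (used)
		strcat(d->trace, " ");
	strcat(d->trace, tag);
}

/* Mirrors the order visible in the patched reset_sriov hunks: KFD
 * pre-reset comes first, KFD post-reset follows the ring tests. */
static void toy_reset_sriov(struct toy_dev *d)
{
	trace_add(d, "kfd_pre_reset");
	trace_add(d, "request_gpu_access"); /* else-branch not shown in the hunk */
	trace_add(d, "steps_not_shown");    /* placeholder for elided code */
	trace_add(d, "irq_resume_helper");
	trace_add(d, "ib_ring_tests");
	trace_add(d, "kfd_post_reset");
}

/* Mirrors the gpu_recover hunks: the KFD hooks are skipped here for an
 * SR-IOV VF because the SR-IOV reset path calls them itself. */
void toy_gpu_recover(struct toy_dev *d, bool need_emergency_restart)
{
	if (!d->is_sriov_vf)
		trace_add(d, "kfd_pre_reset");

	/* Assumption: the real dispatch is not part of the quoted hunks. */
	if (d->is_sriov_vf)
		toy_reset_sriov(d);
	else
		trace_add(d, "asic_reset");

	if (!need_emergency_restart && !d->is_sriov_vf)
		trace_add(d, "kfd_post_reset");
}
```

Calling toy_gpu_recover() on a device with is_sriov_vf set leaves a trace that starts with kfd_pre_reset and ends with kfd_post_reset, both emitted from inside toy_reset_sriov(). With the flag clear, the same two tags come from toy_gpu_recover() itself, and each hook still appears exactly once.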