I'm not sure about the case you describe. Assume that it can happen, and that the KFD probe and init are invoked when amdkfd is loaded manually. For a bare-metal device, it always behaves correctly. For an SR-IOV virtual function, it doesn't behave correctly with or without this patch: KFD initialization also needs to access the VF in exclusive mode, while the exclusive mode request/release messages are sent in amdgpu_device_init.

--
Sincerely Yours,
Pixel

On 31/10/2017, 11:06 PM, "Tom Stellard" <tstellar at redhat.com> wrote:

>On 10/30/2017 12:57 AM, Pixel Ding wrote:
>> From: pding <Pixel.Ding at amd.com>
>>
>> Move kfd probe prior to device init. Release exclusive mode
>> after hw_init if kfd is not enabled.
>>
>
>What happens if only the amdgpu module is loaded at startup, and then the
>user manually loads the amdkfd module at some point later on. Will the driver
>still behave correctly in this case with this patch?
>
>-Tom
>
>
>> Signed-off-by: pding <Pixel.Ding at amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    | 5 +++--
>>  2 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 400dfaa..e46ec51 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -1716,6 +1716,9 @@ static int amdgpu_init(struct amdgpu_device *adev)
>>  		adev->ip_blocks[i].status.hw = true;
>>  	}
>>
>> +	if (amdgpu_sriov_vf(adev) && !adev->kfd)
>> +		amdgpu_virt_release_full_gpu(adev, true);
>> +
>>  	return 0;
>>  }
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> index 3e9760d..e91907c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> @@ -138,6 +138,8 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags)
>>  	    !pci_is_thunderbolt_attached(dev->pdev))
>>  		flags |= AMD_IS_PX;
>>
>> +	amdgpu_amdkfd_device_probe(adev);
>> +
>>  	/* amdgpu_device_init should report only fatal error
>>  	 * like memory allocation failure or iomapping failure,
>>  	 * or memory manager initialization failure, it must
>> @@ -170,7 +172,6 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags)
>>  				"Error during ACPI methods call\n");
>>  	}
>>
>> -	amdgpu_amdkfd_device_probe(adev);
>>  	amdgpu_amdkfd_device_init(adev);
>>
>>  	if (amdgpu_device_is_px(dev)) {
>> @@ -182,7 +183,7 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags)
>>  		pm_runtime_put_autosuspend(dev->dev);
>>  	}
>>
>> -	if (amdgpu_sriov_vf(adev))
>> +	if (amdgpu_sriov_vf(adev) && adev->kfd)
>>  		amdgpu_virt_release_full_gpu(adev, true);
>>
>>  out:
>>
>