This is a note to let you know that I've just added the patch titled

    drm/amdgpu: Don't resume IOMMU after incomplete init

to the 6.1-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     drm-amdgpu-don-t-resume-iommu-after-incomplete-init.patch
and it can be found in the queue-6.1 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


From f3921a9a641483784448fb982b2eb738b383d9b9 Mon Sep 17 00:00:00 2001
From: Felix Kuehling <Felix.Kuehling@xxxxxxx>
Date: Mon, 13 Mar 2023 20:03:08 -0400
Subject: drm/amdgpu: Don't resume IOMMU after incomplete init

From: Felix Kuehling <Felix.Kuehling@xxxxxxx>

commit f3921a9a641483784448fb982b2eb738b383d9b9 upstream.

Check kfd->init_complete in kgd2kfd_iommu_resume, consistent with other
kgd2kfd calls. This should fix IOMMU errors on resume from suspend when
KFD IOMMU initialization failed.

Reported-by: Matt Fagnani <matt.fagnani@xxxxxxxx>
Link: https://lore.kernel.org/r/4a3b225c-2ffd-e758-4de1-447375e34cad@xxxxxxxx/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217170
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2454
Cc: Vasant Hegde <vasant.hegde@xxxxxxx>
Cc: Linux regression tracking (Thorsten Leemhuis) <regressions@xxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
Acked-by: Alex Deucher <alexander.deucher@xxxxxxx>
Tested-by: Matt Fagnani <matt.fagnani@xxxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -59,6 +59,7 @@ static int kfd_gtt_sa_init(struct kfd_de
 				unsigned int chunk_size);
 static void kfd_gtt_sa_fini(struct kfd_dev *kfd);
 
+static int kfd_resume_iommu(struct kfd_dev *kfd);
 static int kfd_resume(struct kfd_dev *kfd);
 
 static void kfd_device_info_set_sdma_info(struct kfd_dev *kfd)
@@ -634,7 +635,7 @@ bool kgd2kfd_device_init(struct kfd_dev
 
 	svm_migrate_init(kfd->adev);
 
-	if (kgd2kfd_resume_iommu(kfd))
+	if (kfd_resume_iommu(kfd))
 		goto device_iommu_error;
 
 	if (kfd_resume(kfd))
@@ -783,6 +784,14 @@ int kgd2kfd_resume(struct kfd_dev *kfd,
 
 int kgd2kfd_resume_iommu(struct kfd_dev *kfd)
 {
+	if (!kfd->init_complete)
+		return 0;
+
+	return kfd_resume_iommu(kfd);
+}
+
+static int kfd_resume_iommu(struct kfd_dev *kfd)
+{
 	int err = 0;
 
 	err = kfd_iommu_resume(kfd);


Patches currently in stable-queue which might be from Felix.Kuehling@xxxxxxx are

queue-6.1/drm-ttm-fix-a-null-pointer-dereference.patch
queue-6.1/drm-amdgpu-don-t-resume-iommu-after-incomplete-init.patch
queue-6.1/drm-amdkfd-fix-an-illegal-memory-access.patch
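

For readers who want to see the shape of the change outside the kernel
tree, the guard described in the commit message can be sketched as a
small userspace mock. This is only an illustration, not the driver code:
struct mock_kfd_dev, mock_iommu_resume(), mock_resume_iommu() and
mock_kgd2kfd_resume_iommu() are hypothetical stand-ins for struct kfd_dev,
kfd_iommu_resume(), kfd_resume_iommu() and kgd2kfd_resume_iommu(), and the
mock IOMMU hook is assumed to always succeed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct kfd_dev; only what the sketch needs. */
struct mock_kfd_dev {
	bool init_complete;     /* true once device init fully succeeded */
	int iommu_resume_calls; /* how often the mock IOMMU hook was reached */
};

/* Mock of the IOMMU layer's resume hook (assumed to always succeed). */
static int mock_iommu_resume(struct mock_kfd_dev *kfd)
{
	kfd->iommu_resume_calls++;
	return 0;
}

/*
 * Internal helper: resumes unconditionally. In the patch, the init path
 * calls the equivalent helper directly, because init_complete is not yet
 * set at that point.
 */
static int mock_resume_iommu(struct mock_kfd_dev *kfd)
{
	return mock_iommu_resume(kfd);
}

/*
 * Externally visible entry point: a no-op returning success when
 * initialization never completed, so a failed init is not followed by
 * an IOMMU resume on wake-up from suspend.
 */
int mock_kgd2kfd_resume_iommu(struct mock_kfd_dev *kfd)
{
	if (!kfd->init_complete)
		return 0;

	return mock_resume_iommu(kfd);
}
```

With init_complete false, mock_kgd2kfd_resume_iommu() returns 0 without
touching the mock IOMMU hook; with init_complete true it forwards to the
internal helper. Splitting the function in two is what lets the init path
keep resuming the IOMMU before init_complete is set.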