Patch "drm/amdgpu: Init zone device and drm client after mode-1 reset on reload" has been added to the 6.8-stable tree

This is a note to let you know that I've just added the patch titled

    drm/amdgpu: Init zone device and drm client after mode-1 reset on reload

to the 6.8-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     drm-amdgpu-init-zone-device-and-drm-client-after-mod.patch
and it can be found in the queue-6.8 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 7e9f604a76fc0a9616429b217b1bc372e2932dba
Author: Ahmad Rehman <Ahmad.Rehman@xxxxxxx>
Date:   Mon Mar 4 15:56:00 2024 -0600

    drm/amdgpu: Init zone device and drm client after mode-1 reset on reload
    
    [ Upstream commit f679fd6057fbf5ab34aaee28d58b7f81af0cbf48 ]
    
    In a passthrough environment, when amdgpu is reloaded after unload, a
    mode-1 reset is triggered after the necessary IPs are initialized. That
    init does not include KFD; KFD init waits until the reset has completed.
    KFD init is called in the reset handler, but in this case the zone device
    and drm client are not initialized, so an app can trigger a kernel panic.
    
    v2: Remove the init KFD condition from amdgpu_amdkfd_drm_client_create,
    as the previous version has the potential to create the DRM client twice.
    
    v3: The v2 patch results in an SDMA engine hang, as DRM open causes a VM
    clear to SDMA before SDMA init. Add a condition to drm client creation,
    on top of v1, to guard against the drm client being created multiple times.
    
    Signed-off-by: Ahmad Rehman <Ahmad.Rehman@xxxxxxx>
    Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
    Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 41db030ddc4ee..131983ed43465 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -146,7 +146,7 @@ int amdgpu_amdkfd_drm_client_create(struct amdgpu_device *adev)
 {
 	int ret;
 
-	if (!adev->kfd.init_complete)
+	if (!adev->kfd.init_complete || adev->kfd.client.dev)
 		return 0;
 
 	ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd",
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 586f4d03039df..64b1bb2404242 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2451,8 +2451,11 @@ static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work)
 	}
 	for (i = 0; i < mgpu_info.num_dgpu; i++) {
 		adev = mgpu_info.gpu_ins[i].adev;
-		if (!adev->kfd.init_complete)
+		if (!adev->kfd.init_complete) {
+			kgd2kfd_init_zone_device(adev);
 			amdgpu_amdkfd_device_init(adev);
+			amdgpu_amdkfd_drm_client_create(adev);
+		}
 		amdgpu_ttm_set_buffer_funcs_status(adev, true);
 	}
 }
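
The control flow this patch sets up can be modelled outside the kernel. The
sketch below is only an illustration under stated assumptions: every mock_*
name is hypothetical and stands in for the real amdgpu structures and
functions, which are not reproduced here. It shows the two properties the
commit message describes: the client is created from the delayed reset path
once KFD init has run, and the extra check on client.dev makes a repeated
creation call a no-op.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real amdgpu/KFD types. */
struct mock_client { void *dev; };
struct mock_kfd { bool init_complete; struct mock_client client; };
struct mock_adev { struct mock_kfd kfd; int client_registrations; };

/* Models the guarded client creation: skip if KFD is not initialized
 * yet, or if a client is already bound to a device. */
static int mock_drm_client_create(struct mock_adev *adev)
{
	if (!adev->kfd.init_complete || adev->kfd.client.dev)
		return 0;

	adev->kfd.client.dev = adev;   /* stands in for client init */
	adev->client_registrations++;  /* stands in for client registration */
	return 0;
}

/* Models the per-device body of the delayed reset work handler. */
static void mock_delayed_reset_handler(struct mock_adev *adev)
{
	if (!adev->kfd.init_complete) {
		/* zone device init would happen here in the real driver */
		adev->kfd.init_complete = true; /* stands in for KFD device init */
		mock_drm_client_create(adev);
	}
}
```

In this model, calling mock_drm_client_create() before the handler has run
does nothing, and calling it again afterwards leaves the registration count
unchanged.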



