Patch "drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init"" has been added to the 6.3-stable tree

This is a note to let you know that I've just added the patch titled

    drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init"

to the 6.3-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     drm-amdgpu-release-gpu-full-access-after-amdgpu_devi.patch
and it can be found in the queue-6.3 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 5af526c608be17002e00dfcf356220f8e331dd53
Author: Chong Li <chongli2@xxxxxxx>
Date:   Fri Apr 14 13:51:19 2023 +0800

    drm/amdgpu: release gpu full access after "amdgpu_device_ip_late_init"
    
    [ Upstream commit 38eecbe086a4e52f54b2bbda8feba65d44addbef ]
    
    [WHY]
     "amdgpu_irq_update()", called from "amdgpu_device_ip_late_init()", runs in atomic context.
     Registers must not be accessed through KIQ there, since "amdgpu_kiq_rreg()" may call "msleep()".
    
    [HOW]
     Move the call to "amdgpu_virt_release_full_gpu()" after "amdgpu_device_ip_late_init()",
     so that registers are accessed through RLCG instead of KIQ.
    
    Call Trace:
      <TASK>
      show_stack+0x52/0x69
      dump_stack_lvl+0x49/0x6d
      dump_stack+0x10/0x18
      __schedule_bug.cold+0x4f/0x6b
      __schedule+0x473/0x5d0
      ? __wake_up_klogd.part.0+0x40/0x70
      ? vprintk_emit+0xbe/0x1f0
      schedule+0x68/0x110
      schedule_timeout+0x87/0x160
      ? timer_migration_handler+0xa0/0xa0
      msleep+0x2d/0x50
      amdgpu_kiq_rreg+0x18d/0x1f0 [amdgpu]
      amdgpu_device_rreg.part.0+0x59/0xd0 [amdgpu]
      amdgpu_device_rreg+0x3a/0x50 [amdgpu]
      amdgpu_sriov_rreg+0x3c/0xb0 [amdgpu]
      gfx_v10_0_set_gfx_eop_interrupt_state.constprop.0+0x16c/0x190 [amdgpu]
      gfx_v10_0_set_eop_interrupt_state+0xa5/0xb0 [amdgpu]
      amdgpu_irq_update+0x53/0x80 [amdgpu]
      amdgpu_irq_get+0x7c/0xb0 [amdgpu]
      amdgpu_fence_driver_hw_init+0x58/0x90 [amdgpu]
      amdgpu_device_init.cold+0x16b7/0x2022 [amdgpu]
    
    Signed-off-by: Chong Li <chongli2@xxxxxxx>
    Reviewed-by: JingWen.Chen2@xxxxxxx
    Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
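
The ordering problem described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not driver code: every name below (model_dev, model_rreg, model_late_init, model_init) is hypothetical, and the model simply assumes, as the commit message states, that register reads use a non-sleeping path while the VF still holds full GPU access and a possibly sleeping KIQ path once access has been released.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in amdgpu. */
struct model_dev {
	bool full_access;     /* VF still holds full (exclusive) GPU access */
	bool in_atomic;       /* models running in atomic context */
	int  sleep_in_atomic; /* counts modeled "scheduling while atomic" bugs */
};

/* Models a register read: only the post-release (KIQ) path may sleep. */
static void model_rreg(struct model_dev *d)
{
	if (!d->full_access && d->in_atomic)
		d->sleep_in_atomic++; /* the KIQ path would call msleep() here */
}

/* Models late init: the interrupt-state update reads a register atomically. */
static void model_late_init(struct model_dev *d)
{
	d->in_atomic = true;
	model_rreg(d);
	d->in_atomic = false;
}

/* Runs the modeled init sequence and returns the number of violations. */
static int model_init(bool release_before_late_init)
{
	struct model_dev d = { .full_access = true };

	if (release_before_late_init)
		d.full_access = false; /* old order: release at end of ip_init */
	model_late_init(&d);
	d.full_access = false;         /* new order: release after late init */
	return d.sleep_in_atomic;
}
```

In this model, model_init(true) (release before late init, the old order) records one violation, while model_init(false) (release after late init, the order this patch introduces) records none.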

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index aa46726dfdb01..464f0abc855e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2523,8 +2523,6 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 	amdgpu_fru_get_product_info(adev);
 
 init_failed:
-	if (amdgpu_sriov_vf(adev))
-		amdgpu_virt_release_full_gpu(adev, true);
 
 	return r;
 }
@@ -3841,18 +3839,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 
 	r = amdgpu_device_ip_init(adev);
 	if (r) {
-		/* failed in exclusive mode due to timeout */
-		if (amdgpu_sriov_vf(adev) &&
-		    !amdgpu_sriov_runtime(adev) &&
-		    amdgpu_virt_mmio_blocked(adev) &&
-		    !amdgpu_virt_wait_reset(adev)) {
-			dev_err(adev->dev, "VF exclusive mode timeout\n");
-			/* Don't send request since VF is inactive. */
-			adev->virt.caps &= ~AMDGPU_SRIOV_CAPS_RUNTIME;
-			adev->virt.ops = NULL;
-			r = -EAGAIN;
-			goto release_ras_con;
-		}
 		dev_err(adev->dev, "amdgpu_device_ip_init failed\n");
 		amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_AMDGPU_INIT_FAIL, 0, 0);
 		goto release_ras_con;
@@ -3924,8 +3910,10 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 				   msecs_to_jiffies(AMDGPU_RESUME_MS));
 	}
 
-	if (amdgpu_sriov_vf(adev))
+	if (amdgpu_sriov_vf(adev)) {
+		amdgpu_virt_release_full_gpu(adev, true);
 		flush_delayed_work(&adev->delayed_init_work);
+	}
 
 	r = sysfs_create_files(&adev->dev->kobj, amdgpu_dev_attributes);
 	if (r)
@@ -3965,6 +3953,20 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	return 0;
 
 release_ras_con:
+	if (amdgpu_sriov_vf(adev))
+		amdgpu_virt_release_full_gpu(adev, true);
+
+	/* failed in exclusive mode due to timeout */
+	if (amdgpu_sriov_vf(adev) &&
+		!amdgpu_sriov_runtime(adev) &&
+		amdgpu_virt_mmio_blocked(adev) &&
+		!amdgpu_virt_wait_reset(adev)) {
+		dev_err(adev->dev, "VF exclusive mode timeout\n");
+		/* Don't send request since VF is inactive. */
+		adev->virt.caps &= ~AMDGPU_SRIOV_CAPS_RUNTIME;
+		adev->virt.ops = NULL;
+		r = -EAGAIN;
+	}
 	amdgpu_release_ras_context(adev);
 
 failed:
