The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@xxxxxxxxxxxxxxx>.

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 03ff6d7238b77e5fb2b85dc5fe01d2db9eb893bd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@xxxxxxxxxxxxxxx>' --in-reply-to '2024012704-bunkmate-pacifier-7cb5@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..

Possible dependencies:

03ff6d7238b7 ("drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs")
91963397c49a ("drm/amdgpu: Enable tunneling on high-priority compute queues")
6cb8e3ee3a08 ("drm/amdgpu: update ib start and size alignment")
7a41ed8b59ba ("drm/amdgpu: add new INFO ioctl query for the last GPU page fault")
2e8ef6a56129 ("drm/amdgpu: add cached GPU fault structure to vm struct")
934deb64fdf2 ("drm/amdgpu: Add memory partition id to amdgpu_vm")
be3800f57c3b ("drm/amdgpu: find partition ID when open device")
2c1c7ba457d4 ("drm/amdgpu: support partition drm devices")
4bdca2057933 ("drm/amdgpu: Add utility functions for xcp")
75d1692393cb ("drm/amdgpu: Add initial version of XCP routines")
ea2d2f8ececd ("drm/amdgpu: detect current GPU memory partition mode")
3d2ea552b229 ("drm/amdgpu: implement smuio v13_0_3 callbacks")
8078f1c610fd ("drm/amdgpu: Change num_xcd to xcc_mask")
36be0181eab5 ("drm/amdgpu: program GRBM_MCM_ADDR for non-AID0 GRBM")
5de6bd6a13f1 ("drm/amdgpu: set mmhub bitmask for multiple AIDs")
ed42f2cc3b56 ("drm/amdgpu: correct the vmhub reference for each XCD in gfxhub init")
74c5b85da754 ("drm/amdkfd: Add spatial partitioning support in KFD")
8dc1db3172ae ("drm/amdkfd: Introduce kfd_node struct (v5)")
e6a02e2cc7fe ("drm/amdgpu: Add some XCC programming")
bfb44eacb0e2 ("drm/amdkfd: Set F8_MODE for gc_v9_4_3")

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 03ff6d7238b77e5fb2b85dc5fe01d2db9eb893bd Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@xxxxxxx>
Date: Fri, 19 Jan 2024 12:23:55 -0500
Subject: [PATCH] drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs

This needs to be set to 1 to avoid a potential deadlock in
the GC 10.x and newer.  On GC 9.x and older, this needs
to be set to 0.  This can lead to hangs in some mixed
graphics and compute workloads.  Updated firmware is also
required for AQL.

Reviewed-by: Feifei Xu <Feifei.Xu@xxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index d63cab294883..ecb622b7f970 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -6589,7 +6589,7 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_device *adev, void *m,
 #ifdef __BIG_ENDIAN
 	tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, ENDIAN_SWAP, 1);
 #endif
-	tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, UNORD_DISPATCH, 0);
+	tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, UNORD_DISPATCH, 1);
 	tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, TUNNEL_DISPATCH,
 			    prop->allow_tunneling);
 	tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, PRIV_STATE, 1);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c
index 8b7fed913526..22cbfa1bdadd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c
@@ -170,6 +170,7 @@ static void update_mqd(struct mqd_manager *mm, void *mqd,
 	m->cp_hqd_pq_control = 5 << CP_HQD_PQ_CONTROL__RPTR_BLOCK_SIZE__SHIFT;
 	m->cp_hqd_pq_control |=
 			ffs(q->queue_size / sizeof(unsigned int)) - 1 - 1;
+	m->cp_hqd_pq_control |= CP_HQD_PQ_CONTROL__UNORD_DISPATCH_MASK;
 	pr_debug("cp_hqd_pq_control 0x%x\n", m->cp_hqd_pq_control);
 
 	m->cp_hqd_pq_base_lo = lower_32_bits((uint64_t)q->queue_address >> 8);