Ok - hacked out a patch that allows 6.11-rc4 to boot without hanging -
just disabling the "mes" stuff.

See attached patch.

Yeah!

Andrew

On Tue, 20 Aug 2024 at 00:13, Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
>
> On Mon, Aug 19, 2024 at 9:55 AM Andrew Worsley <amworsley@xxxxxxxxx> wrote:
> >
> > The v6.11-rc4 linux hangs during amdgpu start up where as the v6.10.0
> > is fine. I had to take a photo of the screen (see attachment) from
> > which I generated
> > the following summary:
> >
> > Booting linux v6.11-rc4 :
> > ...
> > amdgpu: Virtual CRAT table created for CPU
> > amdgpu: Topology: Add CPU node
> > initializing kernel modesetting (IP DISCOVERY 0x1002:0x15BF 0xF111:0x0005 0xC2).
> > register mmio base: 0x90500000
> > register mmio size: 524288
> > add ip block number 0 <soc21_common>
> > add ip block number 1 <gmc_v11_0>
> > add ip block number 2 <ih_v6_0>
> > add ip block number 3 <psp>
> > add ip block number 4 <smu>
> > add ip block number 5 <dm>
> > add ip block number 6 <gfx_v11_0>
> > add ip block number 7 <sdma_v6_0>
> > add ip block number 8 <vcn_v4_0>
> > add ip block number 9 <jpeg_v4_0>
> > add ip block number 10 <mes_v11_0>
> > amdgpu 0000:c1:00.0: amdgpu: Fetched VBIOS from VFCT
> > amdgpu: ATOM BIOS: 113-PHXGENERIC-001
> > amdgpu 0000:c1:00.0: Direct firmware load for
> > amdgpu/gc_11_0_1_mes_2.bin failed with error -2
> > amdgpu 0000:c1:00.0: amdgpu: try to fall back to amdgpu/gc_11_0_1_mes.bin ....

From 535c5a73b945615bd1ea90db1d6d331fa9677252 Mon Sep 17 00:00:00 2001
From: Andrew Worsley <amworsley@xxxxxxxxx>
Date: Tue, 20 Aug 2024 16:37:36 +1000
Subject: [PATCH] Fix amdgpu hang on boot by reverting f9d8c5c7855d8f3e4c3e678777d02a49046eafb0.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Revert "drm/amdgpu/gfx: enable mes to map legacy queue support"

Disable the mes stuff - now doesn't hang on my AMD Ryzen™ 7040 Series
framework 16inch laptop
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 44 ++-----------------------
 1 file changed, 2 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index c770cb201e64..f2fe7874c6da 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -509,16 +509,6 @@ int amdgpu_gfx_disable_kcq(struct amdgpu_device *adev, int xcc_id)
 	int i, r = 0;
 	int j;
 
-	if (adev->enable_mes) {
-		for (i = 0; i < adev->gfx.num_compute_rings; i++) {
-			j = i + xcc_id * adev->gfx.num_compute_rings;
-			amdgpu_mes_unmap_legacy_queue(adev,
-						   &adev->gfx.compute_ring[j],
-						   RESET_QUEUES, 0, 0);
-		}
-		return 0;
-	}
-
 	if (!kiq->pmf || !kiq->pmf->kiq_unmap_queues)
 		return -EINVAL;
 
@@ -561,18 +551,6 @@ int amdgpu_gfx_disable_kgq(struct amdgpu_device *adev, int xcc_id)
 	int i, r = 0;
 	int j;
 
-	if (adev->enable_mes) {
-		if (amdgpu_gfx_is_master_xcc(adev, xcc_id)) {
-			for (i = 0; i < adev->gfx.num_gfx_rings; i++) {
-				j = i + xcc_id * adev->gfx.num_gfx_rings;
-				amdgpu_mes_unmap_legacy_queue(adev,
-						      &adev->gfx.gfx_ring[j],
-						      PREEMPT_QUEUES, 0, 0);
-			}
-		}
-		return 0;
-	}
-
 	if (!kiq->pmf || !kiq->pmf->kiq_unmap_queues)
 		return -EINVAL;
 
@@ -657,9 +635,6 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev, int xcc_id)
 	uint64_t queue_mask = 0;
 	int r, i, j;
 
-	if (adev->enable_mes)
-		return amdgpu_gfx_mes_enable_kcq(adev, xcc_id);
-
 	if (!kiq->pmf || !kiq->pmf->kiq_map_queues || !kiq->pmf->kiq_set_resources)
 		return -EINVAL;
 
@@ -678,10 +653,9 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev, int xcc_id)
 		queue_mask |= (1ull << amdgpu_queue_mask_bit_to_set_resource_bit(adev, i));
 	}
 
-	amdgpu_device_flush_hdp(adev, NULL);
-
 	DRM_INFO("kiq ring mec %d pipe %d q %d\n", kiq_ring->me, kiq_ring->pipe,
-		 kiq_ring->queue);
+							kiq_ring->queue);
+	amdgpu_device_flush_hdp(adev, NULL);
 
 	spin_lock(&kiq->ring_lock);
 	r = amdgpu_ring_alloc(kiq_ring, kiq->pmf->map_queues_size *
@@ -719,20 +693,6 @@ int amdgpu_gfx_enable_kgq(struct amdgpu_device *adev, int xcc_id)
 
 	amdgpu_device_flush_hdp(adev, NULL);
 
-	if (adev->enable_mes) {
-		for (i = 0; i < adev->gfx.num_gfx_rings; i++) {
-			j = i + xcc_id * adev->gfx.num_gfx_rings;
-			r = amdgpu_mes_map_legacy_queue(adev,
-							&adev->gfx.gfx_ring[j]);
-			if (r) {
-				DRM_ERROR("failed to map gfx queue\n");
-				return r;
-			}
-		}
-
-		return 0;
-	}
-
 	spin_lock(&kiq->ring_lock);
 	/* No need to map kcq on the slave */
 	if (amdgpu_gfx_is_master_xcc(adev, xcc_id)) {
-- 
2.39.2