On Sun, Sep 15, 2024 at 5:28 PM Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:
>
> Hello,
>
> (Apologies if I have CC'd the wrong people/places - I just went by
> what get_maintainer.pl -f drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> said)
>
> I recently upgraded from Ubuntu 20.04 (5.15.0-119.129~20.04.1-generic
> kernel) to Ubuntu 24.04 (6.8.0-44-generic kernel) and found that while
> booting the kernel hangs for around 15 seconds just before the amdgpu
> driver is loaded:
>
> [ 4.459519] radeon 0000:01:05.0: [drm] Cannot find any crtc or sizes
> [ 4.460118] probe of 0000:01:05.0 returned 0 after 902266 usecs
> [ 4.460184] initcall radeon_module_init+0x0/0xff0 [radeon] returned
> 0 after 902473 usecs
> [ 4.465797] calling drm_buddy_module_init+0x0/0xff0 [drm_buddy] @ 122
> [ 4.465853] initcall drm_buddy_module_init+0x0/0xff0 [drm_buddy]
> returned 0 after 29 usecs
> [ 4.469419] radeon 0000:01:05.0: [drm] Cannot find any crtc or sizes
> [ 4.473831] calling drm_sched_fence_slab_init+0x0/0xff0 [gpu_sched] @ 122
> [ 4.473892] initcall drm_sched_fence_slab_init+0x0/0xff0
> [gpu_sched] returned 0 after 31 usecs
> [ 18.724442] calling amdgpu_init+0x0/0xff0 [amdgpu] @ 122
> [ 18.726303] [drm] amdgpu kernel modesetting enabled.
> [ 18.726576] amdgpu: Virtual CRAT table created for CPU
> [ 18.726609] amdgpu: Topology: Add CPU node
> [ 18.726787] initcall amdgpu_init+0x0/0xff0 [amdgpu] returned 0
> after 528 usecs
>
> I've checked and the problem still exists in 6.11.0-061100rc7-generic
> (which is close to vanilla upstream).
>
> The graphics card I have is:
> 01:05.0 VGA compatible controller: Advanced Micro Devices, Inc.
> [AMD/ATI] RS880M [Mobility Radeon HD 4225/4250] (prog-if 00 [VGA
> controller])
> 01:05.0 0300: 1002:9712 (prog-if 00 [VGA controller])
> Subsystem: 103c:1609
>
> At first I thought the problem was related to the change
> https://github.com/torvalds/linux/commit/eb4fd29afd4aa1c98d882800ceeee7d1f5262803
> ("drm/amdgpu: bind to any 0x1002 PCI diplay [sic] class device") which
> now means my card is claimed by two drivers (radeon and amdgpu). That
> change complicated things because:
> - The amdgpu module and its dependencies remain permanently present (which
>   never used to happen)
> - It took some time for me to realise that the amdgpu driver hadn't suddenly
>   grown the ability to support this old card :-) There is a nice table on
>   https://www.x.org/wiki/RadeonFeature/#decoderringforengineeringvsmarketingnames
>   that shows it is part of the R600 family and
>   https://www.x.org/wiki/RadeonFeature/#featurematrixforfreeradeondrivers shows
>   that R600 is only supported by the radeon driver.
>
> However, testing a 5.16.20-051620-generic kernel showed that while the
> amdgpu module is loaded, there is no 15 second hang... So far my
> testing has the following results:
> - 5.16.20-051620-generic - amdgpu loaded, no hang
> - 5.18.19-051819-generic - amdgpu loaded, no hang
> - 6.0.0-060000-generic - amdgpu loaded, hang
> - 6.2.0-060200-generic - amdgpu loaded, hang
> - 6.8.0-44-generic - amdgpu loaded, hang
> - 6.11.0-061100rc7-generic - amdgpu loaded, hang
>
> To work around the problem I've taken to blacklisting amdgpu in
> /etc/modprobe.d/ which makes the hang disappear.
>
> Does anyone else see this issue? Is there something better than my
> current workaround? What do other drivers that want to bind to such a
> large set of devices do? Further, while I'm already using
> initcall_debug, is there any other kernel boot parameter to make
> what's happening more visible?

Do you have Secure Boot enabled?
If so, perhaps this is relevant:
https://bugzilla.kernel.org/show_bug.cgi?id=219229

Alex

>
> --
> Sitsofe
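
For anyone trying to pin down a similar stall in a saved boot log, here is a
minimal sketch (not part of the original report) that scans timestamped
kernel log lines and reports the largest jump between consecutive entries.
It assumes the default "[seconds.microseconds]" prefix; the function name
largest_gap and the SAMPLE constant are hypothetical, and the sample lines
are copied from the log quoted above with the mail-wrapped lines rejoined.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches the "[ seconds.micros]" prefix the kernel puts on each log line.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def largest_gap(lines: Iterable[str]) -> Optional[Tuple[float, str, str]]:
    """Return (gap_seconds, message_before, message_after) for the biggest
    jump between consecutive timestamped lines, or None if fewer than two
    timestamped lines are present."""
    best = None
    prev = None
    for line in lines:
        match = TIMESTAMP.match(line.strip())
        if not match:
            continue  # lines without a timestamp are skipped
        cur = (float(match.group(1)), match.group(2))
        if prev is not None:
            gap = cur[0] - prev[0]
            if best is None or gap > best[0]:
                best = (gap, prev[1], cur[1])
        prev = cur
    return best


# Excerpt from the quoted log, with wrapped lines rejoined.
SAMPLE = """\
[ 4.460184] initcall radeon_module_init+0x0/0xff0 [radeon] returned 0 after 902473 usecs
[ 4.465797] calling drm_buddy_module_init+0x0/0xff0 [drm_buddy] @ 122
[ 4.473831] calling drm_sched_fence_slab_init+0x0/0xff0 [gpu_sched] @ 122
[ 4.473892] initcall drm_sched_fence_slab_init+0x0/0xff0 [gpu_sched] returned 0 after 31 usecs
[ 18.724442] calling amdgpu_init+0x0/0xff0 [amdgpu] @ 122
[ 18.726303] [drm] amdgpu kernel modesetting enabled.
"""

if __name__ == "__main__":
    result = largest_gap(SAMPLE.splitlines())
    if result is not None:
        gap, before, after = result
        print(f"largest gap: {gap:.2f} s")
        print(f"  last line before: {before}")
        print(f"  first line after: {after}")
```

On the excerpt this points at the roughly 14 second jump between the
drm_sched_fence_slab_init return and the amdgpu_init call. It only shows
where the time went missing, not why.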