On Tue, Jun 28, 2022 at 2:21 PM Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx> wrote:
> Christian, can you look into why drm_aperture_remove_conflicting_pci_framebuffers causes this kernel bug on my machine?

[    6.822385] amdgpu: Ignoring ACPI CRAT on non-APU system
[    6.822462] amdgpu: Virtual CRAT table created for CPU
[    6.822654] amdgpu: Topology: Add CPU node
[    6.827643] Console: switching to colour dummy device 80x25
[    6.845504] BUG: kernel NULL pointer dereference, address: 0000000000000038
[    6.845509] #PF: supervisor read access in kernel mode
[    6.845512] #PF: error_code(0x0000) - not-present page
[    6.845515] PGD 0 P4D 0
[    6.845518] Oops: 0000 [#1] PREEMPT SMP NOPTI
[    6.845522] CPU: 27 PID: 612 Comm: systemd-udevd Tainted: G W -------- --- 5.19.0-0.rc5.20220705gitc1084b6c5620.40.fc37.x86_64 #1
[    6.845528] Hardware name: System manufacturer System Product Name/ROG STRIX X570-I GAMING, BIOS 4403 04/27/2022
[    6.845533] RIP: 0010:kernfs_find_and_get_ns+0x11/0x70
[    6.845539] Code: 78 e8 c3 fa 31 00 48 85 c0 75 e1 eb 93 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 41 55 49 89 d5 41 54 49 89 f4 55 53 <48> 8b 47 38 48 89 fb 48 85 c0 48 0f 44 c7 48 8b a8 80 00 00 00 48
[    6.845546] RSP: 0018:ffffa98c022f3aa0 EFLAGS: 00010246
[    6.845550] RAX: 0000000000000000 RBX: ffffffffaf52c3c0 RCX: ffff9e150147b640
[    6.845553] RDX: 0000000000000000 RSI: ffffffffaf52c508 RDI: 0000000000000000
[    6.845557] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000249249d4
[    6.845560] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffaf52c508
[    6.845563] R13: 0000000000000000 R14: ffff9e157aa93900 R15: 0000000000000000
[    6.845567] FS: 00007fabaafbf680(0000) GS:ffff9e23e6a00000(0000) knlGS:0000000000000000
[    6.845571] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    6.845574] CR2: 0000000000000038 CR3: 000000017cb56000 CR4: 0000000000350ee0
[    6.845578] Call Trace:
[    6.845579]  <TASK>
[    6.845582]  sysfs_unmerge_group+0x18/0x60
[    6.845585]  dpm_sysfs_remove+0x20/0x60
[    6.845590]  device_del+0xa4/0x3f0
[    6.845594]  platform_device_del.part.0+0x13/0x70
[    6.845599]  platform_device_unregister+0x1c/0x30
[    6.845602]  sysfb_disable+0x2d/0x60
[    6.845605]  remove_conflicting_framebuffers+0x1b/0xc0
[    6.845610]  remove_conflicting_pci_framebuffers+0xce/0x120
[    6.845614]  drm_aperture_remove_conflicting_pci_framebuffers+0x57/0x80
[    6.845620]  amdgpu_pci_probe+0xcb/0x360 [amdgpu]
[    6.845760]  local_pci_probe+0x41/0x80
[    6.845764]  pci_device_probe+0xaa/0x210
[    6.845768]  really_probe+0x1bf/0x390
[    6.845771]  __driver_probe_device+0xfc/0x170
[    6.845775]  driver_probe_device+0x1f/0x90
[    6.845778]  __driver_attach+0xbf/0x1b0
[    6.845782]  ? __device_attach_driver+0xe0/0xe0
[    6.845785]  bus_for_each_dev+0x65/0x90
[    6.845789]  bus_add_driver+0x15c/0x200
[    6.845792]  driver_register+0x89/0xe0
[    6.845796]  ? 0xffffffffc0c8d000
[    6.845801]  do_one_initcall+0x69/0x350
[    6.845806]  ? rcu_read_lock_sched_held+0x3c/0x70
[    6.845810]  ? trace_kmalloc+0x3c/0x100
[    6.845814]  ? kmem_cache_alloc_trace+0x1e8/0x350
[    6.845818]  do_init_module+0x4a/0x200
[    6.845822]  __do_sys_init_module+0x13a/0x190
[    6.845827]  do_syscall_64+0x5b/0x80
[    6.845832]  ? asm_exc_page_fault+0x27/0x30
[    6.845835]  ? lockdep_hardirqs_on+0x7d/0x100
[    6.845839]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[    6.845842] RIP: 0033:0x7fababb7463e
[    6.845845] Code: 48 8b 0d e5 57 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b2 57 0c 00 f7 d8 64 89 01 48
[    6.845852] RSP: 002b:00007ffc6a6c9658 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[    6.845857] RAX: ffffffffffffffda RBX: 00005620deef53f0 RCX: 00007fababb7463e
[    6.845860] RDX: 00005620deeb2df0 RSI: 00000000010bfac6 RDI: 00007faba943e010
[    6.845864] RBP: 00005620deeb2df0 R08: 00005620deef4880 R09: 0000000000000000
[    6.845867] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000020000
[    6.845870] R13: 00005620deeb5330 R14: 0000000000000000 R15: 00005620deef0410
[    6.845875]  </TASK>
[    6.845877] Modules linked in: amdgpu(+) drm_ttm_helper ttm iommu_v2 crct10dif_pclmul gpu_sched crc32_pclmul crc32c_intel drm_buddy drm_display_helper ucsi_ccg nvme igb typec_ucsi ghash_clmulni_intel ccp cec typec sp5100_tco nvme_core dca wmi ip6_tables ip_tables ipmi_devintf ipmi_msghandler fuse
[    6.845898] CR2: 0000000000000038
[    6.845900] ---[ end trace 0000000000000000 ]---

$ /usr/src/kernels/5.19.0-0.rc5.20220705gitc1084b6c5620.40.fc37.x86_64/scripts/faddr2line /lib/debug/lib/modules/5.19.0-0.rc5.20220705gitc1084b6c5620.40.fc37.x86_64/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.debug amdgpu_pci_probe+0xcb
amdgpu_pci_probe+0xcb/0x360:
amdgpu_pci_probe at /usr/src/debug/kernel-5.19-rc5-49-gc1084b6c5620/linux-5.19.0-0.rc5.20220705gitc1084b6c5620.40.fc37.x86_64/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2061

$ cat -s -n /usr/src/debug/kernel-5.19-rc5-49-gc1084b6c5620/linux-5.19.0-0.rc5.20220705gitc1084b6c5620.40.fc37.x86_64/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | head -2071 | tail -20
  2052                   "Use radeon.cik_support=0 amdgpu.cik_support=1 to override.\n"
  2053                  );
  2054              return -ENODEV;
  2055          }
  2056      }
  2057  #endif
  2058
  2059      /* Get rid of things like offb */
  2060      ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, &amdgpu_kms_driver);
  2061      if (ret)
  2062          return ret;
  2063
  2064      adev = devm_drm_dev_alloc(&pdev->dev, &amdgpu_kms_driver, typeof(*adev), ddev);
  2065      if (IS_ERR(adev))
  2066          return PTR_ERR(adev);
  2067
  2068      adev->dev = &pdev->dev;
  2069      adev->pdev = pdev;
  2070      ddev = adev_to_drm(adev);

$ git blame -L 2052,2070 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
Blaming lines: 100% (19/19), done.
984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2052)             dev_info(&pdev->dev,
984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2053)                  "Use radeon.cik_support=0 amdgpu.cik_support=1 to override.\n"
984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2054)                 );
984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2055)             return -ENODEV;
984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2056)         }
984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2057)     }
984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2058) #endif
984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2059)
d38ceaf99ed01 (Alex Deucher      2015-04-20 16:55:21 -0400 2060)     /* Get rid of things like offb */
97c9bfe3f6605 (Thomas Zimmermann 2021-06-29 15:58:33 +0200 2061)     ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, &amdgpu_kms_driver);
d38ceaf99ed01 (Alex Deucher      2015-04-20 16:55:21 -0400 2062)     if (ret)
d38ceaf99ed01 (Alex Deucher      2015-04-20 16:55:21 -0400 2063)         return ret;
d38ceaf99ed01 (Alex Deucher      2015-04-20 16:55:21 -0400 2064)
5088d6572e8ff (Luben Tuikov      2020-11-04 11:04:25 +0100 2065)     adev = devm_drm_dev_alloc(&pdev->dev, &amdgpu_kms_driver, typeof(*adev), ddev);
df2ce4596c044 (Luben Tuikov      2020-09-18 15:25:04 +0200 2066)     if (IS_ERR(adev))
df2ce4596c044 (Luben Tuikov      2020-09-18 15:25:04 +0200 2067)         return PTR_ERR(adev);
8aba21b75136c (Luben Tuikov      2020-08-14 20:41:55 -0400 2068)
8aba21b75136c (Luben Tuikov      2020-08-14 20:41:55 -0400 2069)     adev->dev = &pdev->dev;
8aba21b75136c (Luben Tuikov      2020-08-14 20:41:55 -0400 2070)     adev->pdev = pdev;

Thomas, you recently changed this line. Can you tell why we are hitting a kernel Oops here?

Full kernel log (5.19-rc5): https://pastebin.com/5Ag804bd

--
Best Regards,
Mike Gavrilov.