https://bugzilla.kernel.org/show_bug.cgi?id=210321 Tristen Hayfield (tristen.hayfield@xxxxxxxxx) changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |tristen.hayfield@xxxxxxxxx --- Comment #4 from Tristen Hayfield (tristen.hayfield@xxxxxxxxx) --- I'm seeing this on the 5.10.* series as well. Currently 5.10.23, Gentoo. Radeon RX 5500 XT. Looking at the offending section of code, it seems an assertion is being triggered: // Fallback: Try to only support G6 temperature read latency context->bw_ctx.dml.soc.dram_clock_change_latency_us = context->bw_ctx.dml.soc.dummy_pstate_latency_us; voltage_supported = dcn20_validate_bandwidth_internal(dc, context, false); dummy_pstate_supported = context->bw_ctx.bw.dcn.clk.p_state_change_support; if (voltage_supported && dummy_pstate_supported) { context->bw_ctx.bw.dcn.clk.p_state_change_support = false; goto restore_dml_state; } // ERROR: fallback is supposed to always work. ASSERT(false); So one of (or both) voltage_supported and dummy_pstate_supported are evaluating to false here and falling through to the assertions. Stack trace attached for completeness' sake. Hopefully a dev that understands the hardware will take a look at this one day and find it helpful. [ 642.193449] WARNING: CPU: 22 PID: 3546 at drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.c:3242 dcn20_validate_bandwidth_fp+0xd3/0xf0 [amdgpu] [ 642.193450] Modules linked in: fuse nfs lockd grace nfs_ssc sunrpc k10temp amdgpu backlight gpu_sched snd_hda_codec_hdmi ttm iwlmvm iwlwifi acpi_cpufreq efivarfs [ 642.193457] CPU: 22 PID: 3546 Comm: X Not tainted 5.10.23-gentoo #1 [ 642.193457] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS (WI-FI), BIOS 3402 01/13/2021 [ 642.193487] RIP: 0010:dcn20_validate_bandwidth_fp+0xd3/0xf0 [amdgpu] [ 642.193488] Code: 5d 41 5c c3 5b 48 89 ee 4c 89 e7 5d ba 01 00 00 00 41 5c e9 2f f6 ff ff 41 0f b6 f4 48 c7 c7 a0 a8 8c c0 31 c0 e8 8d 09 14 d9 <0f> 0b 48 89 9d 50 26 00 00 44 89 e0 5b 5d 41 5c c3 0f 0b e9 53 ff [ 642.193489] RSP: 0018:ffffc18284b37b40 EFLAGS: 00010246 [ 642.193490] RAX: 0000000000000000 RBX: 4079400000000000 RCX: 0000000000000000 [ 642.193490] RDX: 0000000000000000 RSI: ffff9ea22f197380 RDI: ffff9ea22f197380 [ 642.193491] RBP: ffff9e93ab0e0000 R08: 0000000000000000 R09: ffffc18284b37910 [ 642.193492] R10: ffffc18284b37908 R11: ffffffff9a722228 R12: 0000000000000001 [ 642.193492] R13: 0000000000000000 R14: ffff9e93ab0e0000 R15: ffff9e9344e5b560 [ 642.193493] FS: 00007fc5f22978c0(0000) GS:ffff9ea22f180000(0000) knlGS:0000000000000000 [ 642.193494] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 642.193494] CR2: 000055fa36a96628 CR3: 00000001155b2000 CR4: 0000000000350ee0 [ 642.193495] Call Trace: [ 642.193526] dcn20_validate_bandwidth+0x24/0x40 [amdgpu] [ 642.193548] dc_validate_global_state+0x284/0x300 [amdgpu] [ 642.193580] amdgpu_dm_atomic_check+0xb09/0xc00 [amdgpu] [ 642.193584] drm_atomic_check_only+0x555/0x7d0 [ 642.193585] drm_atomic_commit+0xe/0x50 [ 642.193586] drm_atomic_connector_commit_dpms+0xd5/0xf0 [ 642.193588] drm_mode_obj_set_property_ioctl+0x184/0x3a0 [ 642.193589] ? drm_connector_set_obj_prop+0x80/0x80 [ 642.193590] drm_connector_property_set_ioctl+0x32/0x50 [ 642.193592] drm_ioctl_kernel+0xa5/0xf0 [ 642.193593] drm_ioctl+0x20a/0x3a0 [ 642.193594] ? drm_connector_set_obj_prop+0x80/0x80 [ 642.193614] amdgpu_drm_ioctl+0x44/0x80 [amdgpu] [ 642.193616] __x64_sys_ioctl+0x81/0xa0 [ 642.193618] do_syscall_64+0x33/0x80 [ 642.193620] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 642.193621] RIP: 0033:0x7fc5f24cb227 [ 642.193622] Code: 1f 40 00 48 89 d8 49 8d 3c 1c 48 f7 d8 49 39 c4 72 b1 e8 0c ff ff ff 85 c0 78 b6 5b 4c 89 e0 5d 41 5c c3 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 11 6c 0c 00 f7 d8 64 89 01 48 [ 642.193622] RSP: 002b:00007fff122b98c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 642.193623] RAX: ffffffffffffffda RBX: 00007fff122b9900 RCX: 00007fc5f24cb227 [ 642.193624] RDX: 00007fff122b9900 RSI: 00000000c01064ab RDI: 000000000000000c [ 642.193624] RBP: 00000000c01064ab R08: 0000000000000000 R09: 00007fc5f2b97d10 [ 642.193625] R10: 00007fc5f2b97d20 R11: 0000000000000246 R12: 000055fa38755350 [ 642.193625] R13: 000000000000000c R14: 0000000000000000 R15: 000055fa36abf540 [ 642.193626] ---[ end trace b1edc8bf2eac897c ]--- I added the following line before the assertion and recompiled the kernel: DC_LOG_ERROR("voltage_supported: %d, dummy_pstate_supported: %d\n", voltage_supported, dummy_pstate_supported); When the issue triggered again, it logged: [drm:dcn20_validate_bandwidth_fp [amdgpu]] *ERROR* voltage_supported: 1, dummy_pstate_supported: 0 So in my case the assertion is being triggered because dummy_pstate_supported is false and the fallback is not working as intended. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug. _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel