On 5/10/23 09:20, Michel Dänzer wrote:
> On 5/9/23 23:07, Pillai, Aurabindo wrote:
>>
>> Sorry - the firmware in the previous message is for DCN32. For Navi2x, please use the firmware attached here.
>
> Same problem (contents of /sys/kernel/debug/dri/0/amdgpu_firmware_info below).
>
> Even if it did work with newer FW, the kernel must keep working with older FW, so in that case the new behaviour would need to be guarded by the FW version.
>

Agreed. Were you able to repro the hang on any other modes/monitors?

>
> VCE feature version: 0, firmware version: 0x00000000
> UVD feature version: 0, firmware version: 0x00000000
> MC feature version: 0, firmware version: 0x00000000
> ME feature version: 44, firmware version: 0x00000040
> PFP feature version: 44, firmware version: 0x00000061
> CE feature version: 44, firmware version: 0x00000025
> RLC feature version: 1, firmware version: 0x00000060
> RLC SRLC feature version: 0, firmware version: 0x00000000
> RLC SRLG feature version: 0, firmware version: 0x00000000
> RLC SRLS feature version: 0, firmware version: 0x00000000
> RLCP feature version: 0, firmware version: 0x00000000
> RLCV feature version: 0, firmware version: 0x00000000
> MEC feature version: 44, firmware version: 0x00000071
> MEC2 feature version: 44, firmware version: 0x00000071
> IMU feature version: 0, firmware version: 0x00000000
> SOS feature version: 0, firmware version: 0x00210c64
> ASD feature version: 553648297, firmware version: 0x210000a9
> TA XGMI feature version: 0x00000000, firmware version: 0x2000000f
> TA RAS feature version: 0x00000000, firmware version: 0x1b00013e
> TA HDCP feature version: 0x00000000, firmware version: 0x17000038
> TA DTM feature version: 0x00000000, firmware version: 0x12000015
> TA RAP feature version: 0x00000000, firmware version: 0x07000213
> TA SECUREDISPLAY feature version: 0x00000000, firmware version: 0x00000000
> SMC feature version: 0, program: 0, firmware version: 0x003a5800 (58.88.0)
> SDMA0 feature version: 52, firmware version: 0x00000053
> SDMA1 feature version: 52, firmware version: 0x00000053
> SDMA2 feature version: 52, firmware version: 0x00000053
> SDMA3 feature version: 52, firmware version: 0x00000053
> VCN feature version: 0, firmware version: 0x0211b000
> DMCU feature version: 0, firmware version: 0x00000000
> DMCUB feature version: 0, firmware version: 0x0202001c
> TOC feature version: 0, firmware version: 0x00000000
> MES_KIQ feature version: 0, firmware version: 0x00000000
> MES feature version: 0, firmware version: 0x00000000
> VBIOS version: 113-D4300100-051
>
>
> ------------------------------------------------------------------------
>> *From:* Pillai, Aurabindo <Aurabindo.Pillai@xxxxxxx>
>> *Sent:* Tuesday, May 9, 2023 4:44 PM
>> *To:* Michel Dänzer <michel@xxxxxxxxxxx>; Zhuo, Qingqing (Lillian) <Qingqing.Zhuo@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>; Chalmers, Wesley <Wesley.Chalmers@xxxxxxx>
>> *Cc:* Wang, Chao-kai (Stylon) <Stylon.Wang@xxxxxxx>; Li, Sun peng (Leo) <Sunpeng.Li@xxxxxxx>; Wentland, Harry <Harry.Wentland@xxxxxxx>; Siqueira, Rodrigo <Rodrigo.Siqueira@xxxxxxx>; Li, Roman <Roman.Li@xxxxxxx>; Chiu, Solomon <Solomon.Chiu@xxxxxxx>; Lin, Wayne <Wayne.Lin@xxxxxxx>; Lakha, Bhawanpreet <Bhawanpreet.Lakha@xxxxxxx>; Gutierrez, Agustin <Agustin.Gutierrez@xxxxxxx>; Kotarac, Pavle <Pavle.Kotarac@xxxxxxx>
>> *Subject:* Re: [PATCH 10/66] drm/amd/display: Do not set drr on pipe commit
>>
>> Hi Michel,
>>
>> Could you please try the attached firmware package and check whether you still see the hang without any reverts? If you do see hangs, please send dmesg with "drm.debug=0x156 log_buf_len=30M" in the kernel cmdline.
>>
>> The attached fw is not released to the public yet, but we will be updating it in the linux-firmware tree next week. Please back up your existing firmware, put the attached files into /usr/lib/firmware/updates/amdgpu, and regenerate your ramdisk. On Ubuntu the following should do:
>>
>> sudo update-initramfs -u -k `uname -r`
>>
>> --
>>
>> Regards,
>> Jay
>> ------------------------------------------------------------------------
>> *From:* Michel Dänzer <michel@xxxxxxxxxxx>
>> *Sent:* Tuesday, May 9, 2023 6:59 AM
>> *To:* Zhuo, Qingqing (Lillian) <Qingqing.Zhuo@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>; Chalmers, Wesley <Wesley.Chalmers@xxxxxxx>
>> *Cc:* Wang, Chao-kai (Stylon) <Stylon.Wang@xxxxxxx>; Li, Sun peng (Leo) <Sunpeng.Li@xxxxxxx>; Wentland, Harry <Harry.Wentland@xxxxxxx>; Siqueira, Rodrigo <Rodrigo.Siqueira@xxxxxxx>; Li, Roman <Roman.Li@xxxxxxx>; Chiu, Solomon <Solomon.Chiu@xxxxxxx>; Pillai, Aurabindo <Aurabindo.Pillai@xxxxxxx>; Lin, Wayne <Wayne.Lin@xxxxxxx>; Lakha, Bhawanpreet <Bhawanpreet.Lakha@xxxxxxx>; Gutierrez, Agustin <Agustin.Gutierrez@xxxxxxx>; Kotarac, Pavle <Pavle.Kotarac@xxxxxxx>
>> *Subject:* Re: [PATCH 10/66] drm/amd/display: Do not set drr on pipe commit
>>
>> On 4/14/23 17:52, Qingqing Zhuo wrote:
>>> From: Wesley Chalmers <Wesley.Chalmers@xxxxxxx>
>>>
>>> [WHY]
>>> Writing to DRR registers such as OTG_V_TOTAL_MIN on the same frame as a
>>> pipe commit can cause underflow.
>>>
>>> [HOW]
>>> Move DMUB p-state delegate into optimize_bandwidth; enabling FAMS sets
>>> optimized_required.
>>>
>>> This change expects that Freesync requests are blocked when
>>> optimized_required is true.
>>>
>>> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@xxxxxxx>
>>> Signed-off-by: Wesley Chalmers <Wesley.Chalmers@xxxxxxx>
>> I bisected a regression to this change, see below for the symptoms. Reverting this patch (and the following patch "drm/amd/display: Block optimize on consecutive FAMS enables", which depends on it) on top of the DRM changes merged for 6.4-rc1 avoids the regression.
>>
>> Maybe "Freesync requests are blocked when optimized_required is true" isn't ensured as needed?
>>
>>
>> The symptoms are that the monitor (Samsung Odyssey Neo G9, 5120x1440@240/VRR, connected to Navi 21 via DisplayPort) blanks and the GPU hangs while starting the Steam game Assetto Corsa Competizione (via Proton 7.0).
>>
>> Example dmesg excerpt:
>>
>> amdgpu 0000:0c:00.0: [drm] *ERROR* [CRTC:82:crtc-0] flip_done timed out
>> NMI watchdog: Watchdog detected hard LOCKUP on cpu 6
>> [...]
>> RIP: 0010:amdgpu_device_rreg.part.0+0x2f/0xf0 [amdgpu]
>> Code: 41 54 44 8d 24 b5 00 00 00 00 55 89 f5 53 48 89 fb 4c 3b a7 60 0b 00 00 73 6a 83 e2 02 74 29 4c 03 a3 68 0b 00 00 45 8b 24 24 <48> 8b 43 08 0f b7 70 3e 66 90 44 89 e0 5b 5d 41 5c 31 d2 31 c9 31
>> RSP: 0000:ffffb39a119dfb88 EFLAGS: 00000086
>> RAX: ffffffffc0eb96a0 RBX: ffff9e7963dc0000 RCX: 0000000000007fff
>> RDX: 0000000000000000 RSI: 0000000000004ff6 RDI: ffff9e7963dc0000
>> RBP: 0000000000004ff6 R08: ffffb39a119dfc40 R09: 0000000000000010
>> R10: ffffb39a119dfc40 R11: ffffb39a119dfc44 R12: 00000000000e05ae
>> R13: 0000000000000000 R14: ffff9e7963dc0010 R15: 0000000000000000
>> FS: 000000001012f6c0(0000) GS:ffff9e805eb80000(0000) knlGS:000000007fd40000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00000000461ca000 CR3: 00000002a8a20000 CR4: 0000000000350ee0
>> Call Trace:
>> <TASK>
>> dm_read_reg_func+0x37/0xc0 [amdgpu]
>> generic_reg_get2+0x22/0x60 [amdgpu]
>> optc1_get_crtc_scanoutpos+0x6a/0xc0 [amdgpu]
>> dc_stream_get_scanoutpos+0x74/0x90 [amdgpu]
>> dm_crtc_get_scanoutpos+0x82/0xf0 [amdgpu]
>> amdgpu_display_get_crtc_scanoutpos+0x91/0x190 [amdgpu]
>> ? dm_read_reg_func+0x37/0xc0 [amdgpu]
>> amdgpu_get_vblank_counter_kms+0xb4/0x1a0 [amdgpu]
>> dm_pflip_high_irq+0x213/0x2f0 [amdgpu]
>> amdgpu_dm_irq_handler+0x8a/0x200 [amdgpu]
>> amdgpu_irq_dispatch+0xd4/0x220 [amdgpu]
>> amdgpu_ih_process+0x7f/0x110 [amdgpu]
>> amdgpu_irq_handler+0x1f/0x70 [amdgpu]
>> __handle_irq_event_percpu+0x46/0x1b0
>> handle_irq_event+0x34/0x80
>> handle_edge_irq+0x9f/0x240
>> __common_interrupt+0x66/0x110
>> common_interrupt+0x5c/0xd0
>> asm_common_interrupt+0x22/0x40
>>
>>
>> --
>> Earthling Michel Dänzer | https://redhat.com
>> Libre software enthusiast | Mesa and Xwayland developer
>>
>
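
For reference, a rough sketch of what "guarded by the FW version" could look like when checked from userspace against the amdgpu_firmware_info dump quoted above. This is only an illustration, not the driver's logic: the choice of DMCUB as the relevant firmware, the threshold value, and the function names are all assumptions made up for the example; the thread does not name a minimum version.

```python
import re

# Matches lines such as
#   "DMCUB feature version: 0, firmware version: 0x0202001c"
#   "SMC feature version: 0, program: 0, firmware version: 0x003a5800 (58.88.0)"
# Lines without a "firmware version" field (e.g. "VBIOS version: ...") are skipped.
LINE_RE = re.compile(
    r"^(?P<name>.+?) feature version: \S+?,.*?"
    r"firmware version: (?P<fw>0x[0-9a-fA-F]+)"
)

# Hypothetical threshold, for illustration only; the thread gives no number.
HYPOTHETICAL_MIN_DMCUB = 0x02020020


def parse_firmware_info(text):
    """Return a dict mapping firmware name to its version as an int."""
    versions = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            versions[match.group("name")] = int(match.group("fw"), 16)
    return versions


def new_behaviour_allowed(versions, minimum=HYPOTHETICAL_MIN_DMCUB):
    """Allow the new code path only if DMCUB is at least `minimum`.

    A missing DMCUB entry is treated as version 0, so the old path is kept.
    """
    return versions.get("DMCUB", 0) >= minimum


# Three lines copied from the dump quoted in this thread.
SAMPLE = """\
SMC feature version: 0, program: 0, firmware version: 0x003a5800 (58.88.0)
DMCUB feature version: 0, firmware version: 0x0202001c
VBIOS version: 113-D4300100-051
"""

if __name__ == "__main__":
    info = parse_firmware_info(SAMPLE)
    print(hex(info["DMCUB"]), new_behaviour_allowed(info))
```

On a real system the text would come from /sys/kernel/debug/dri/0/amdgpu_firmware_info (root access needed). Falling back to the old behaviour when the version is unknown matches the requirement that the kernel keeps working with older FW.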