On Fri, Jan 14, 2022 at 1:47 PM Limonciello, Mario <Mario.Limonciello@xxxxxxx> wrote:
>
> [Public]
>
> > > > >> I think the revert is fine once we figure out where we're missing calls to:
> > > > >>
> > > > >> .optimize_pwr_state = dcn21_optimize_pwr_state,
> > > > >> .exit_optimized_pwr_state = dcn21_exit_optimized_pwr_state,
> > > > >>
> > > > >> These are already part of dc_link_detect, so I suspect there's another interface
> > > > >> in DC that should be using these.
> > > > >>
> > > > >> I think the best way to debug this is to revert the patch locally and add a stack
> > > > >> dump when DMCUB hangs or times out.
> > > > > OK so I did this on top of amd-staging-drm-next with my v5 patch (this revert in place)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
> > > > > index 9280f2abd973..0bd32f82f3db 100644
> > > > > --- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
> > > > > +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
> > > > > @@ -789,8 +789,10 @@ enum dmub_status dmub_srv_cmd_with_reply_data(struct dmub_srv *dmub,
> > > > >         // Execute command
> > > > >         status = dmub_srv_cmd_execute(dmub);
> > > > >
> > > > > -       if (status != DMUB_STATUS_OK)
> > > > > +       if (status != DMUB_STATUS_OK) {
> > > > > +               ASSERT(0);
> > > > >                 return status;
> > > > > +       }
> > > > >
> > > > >         // Wait for DMUB to process command
> > > > >         status = dmub_srv_wait_for_idle(dmub, 100000);
> > > > >
> > > > >> That way you can know where the PHY was trying to be accessed without the
> > > > >> refclk being on.
> > > > >>
> > > > >> We had a similar issue in DCN31 which didn't require a W/A like DCN21.
> > > > >>
> > > > >> I'd like to hold off on merging this until that hang is verified as gone.
> > > > >>
> > > > > Then I took a RN laptop running DMUB 0x01010019 and disabled eDP, and confirmed
> > > > > no CRTC was configured but plugged in an HDMI cable:
> > > > >
> > > > > connector[78]: eDP-1
> > > > >         crtc=(null)
> > > > >         self_refresh_aware=0
> > > > > connector[85]: HDMI-A-1
> > > > >         crtc=crtc-1
> > > > >         self_refresh_aware=0
> > > > >
> > > > > I triggered 100 hotplugs like this:
> > > > >
> > > > > #!/bin/bash
> > > > > for i in {0..100..1}
> > > > > do
> > > > >     echo 1 | tee /sys/kernel/debug/dri/0/HDMI-A-1/trigger_hotplug
> > > > >     sleep 3
> > > > > done
> > > > >
> > > > > Unfortunately, no hang or traceback to be seen (and HDMI continues to work).
> > > > > I also manually pulled the plug a handful of times. I don't know the specifics
> > > > > of Lillian's failure though, so this might not be a good enough check.
> > > > >
> > > > > I'll try to upgrade DMUB to 0x101001c (the latest version) and double check
> > > > > that as well.
> > > >
> > > > I applied patch v5 and the above ASSERT patch, on top of both Linux
> > > > 5.16-rc8 and 5.16.
> > > >
> > > > Result: no problems with suspend/resume, 16+ cycles.
> > > >
> > > > As far as the hang goes:
> > > >
> > > > I plugged in an HDMI cable connected to my TV, and configured Gnome to
> > > > use the external display only.
> > > >
> > > > connectors from /sys/kernel/debug/dri/0/state:
> > > >
> > > > connector[78]: eDP-1
> > > >         crtc=(null)
> > > >         self_refresh_aware=0
> > > > connector[85]: HDMI-A-1
> > > >         crtc=crtc-1
> > > >         self_refresh_aware=0
> > > > connector[89]: DP-1
> > > >         crtc=(null)
> > > >         self_refresh_aware=0
> > > >
> > > > I manually unplugged/plugged the HDMI cable 16+ times, and also ran:
> > > >
> > > > $ sudo sh -c 'for ((i=0;i<100;i++)); do echo 1 | tee /sys/kernel/debug/dri/0/HDMI-A-1/trigger_hotplug; sleep 3; done'
> > > >
> > > > The system did not hang, and I saw no kernel log output from the ASSERT.
> > > >
> > > > I also tried a USB-C dock with an HDMI port, with the same results,
> > > > though there are other issues with this (perhaps worthy of other bug
> > > > reports).
> > > >
> > > > Is there some reason to use amd-staging-drm-next for this test?
> > > >
> > > > I don't use the HDMI connection much and I have never experienced a hang
> > > > with HDMI in the first place. Can someone send a link to an
> > > > issue/discussion where this hang is being discussed?
> > > >
> > > > HW: HP ENVY x360 Convertible 15-ds1xxx, AMD Ryzen 7 4700U with Radeon Graphics
> > > > OS/Desktop: Arch Linux, Gnome 41.3 (Wayland)
> > > > FW: linux-firmware-git 20211229.57d6b95-1, DMUB version=0x0101001C
> > > >
> > >
> > > Nicholas,
> > >
> > > We've got a handful of people now (myself included) who have done a bunch of
> > > physical and software triggered hotplugs on a variety of ports on top of both
> > > amd-staging-drm-next and 5.16 and not seeing any hangs. Given this is lingering
> > > on 5.16, are you amenable to it and letting Lillian dig further after she returns
> > > on the specific case that she had problems with to see if we're missing anything
> > > else?
> > >
> > > Thanks,
> >
> > I think it was observed during HDMI compliance testing or frequent HDCP
> > enter/exit on Chrome, I don't remember the details off the top of my head. The
> > system would completely lock up under those conditions.
> >
> > I'm not familiar with the urgency of the request for your specific issue, but if you
> > feel that the tradeoff is worth it then you can go ahead and revert for now.
> >
> > Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@xxxxxxx>
> >
> > Regards,
> > Nicholas Kazlauskas
>
> Thanks. Alex, when this pulls in can you add CC for stable so we get it in 5.16.1 too?

Yes, will do.

Alex