https://bugzilla.kernel.org/show_bug.cgi?id=216645

Bug ID: 216645
Summary: Fence fallback timer expired on ring gfx
Product: Drivers
Version: 2.5
Kernel Version: 5.15.0-43-generic
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri@xxxxxxxxxxxxxxxxxxxx
Reporter: ask4support@xxxxxxxx
Regression: No

Created attachment 303109
  --> https://bugzilla.kernel.org/attachment.cgi?id=303109&action=edit
Kernel log created by the script in the menuentry

Sometimes when I run KDE System Monitor or Chrome, my laptop freezes and won't unfreeze until I reboot (after a while I can move the mouse cursor, but that's all I can do).

I'm using a Dell G5 SE 5505 with an AMD Ryzen 7 4800H as the CPU, a Radeon RX Vega 7 as the iGPU and an AMD Radeon RX 5600M as the dGPU.

I've searched through existing bugs and found that it might be related to interrupts. With that in mind, I compiled a list of kernel parameters that might be related and tested all of them:

PW = Probably Working, NW = Not Working, NB = Not Booting

PW pcie_port_pm=off
PW amdgpu.msi=0
NW amd_iommu=fullflush
NW amd_iommu=force_isolation
NW amd_iommu=off
NW amd_iommu_intr=legacy
NW amd_iommu_intr=vapic kvm-amd.avic=1
NW iommu=off
NW iommu=force
NW iommu=noforce
NW iommu=biomerge
NW iommu=merge
NW iommu=nomerge
NW iommu=forcesac
NW iommu=soft
NW iommu=pt
NW irqfixup
NW irqpoll
NW nointremap
NW pcie_port_pm=force
NW amdgpu.pcie_gen2=1
NW amdgpu.pcie_gen2=0
NW amdgpu.msi=1
NW amdgpu.lockup_timeout=1000
NW amdgpu.lockup_timeout=100
NW amdgpu.aspm=1
NW amdgpu.aspm=0
NW amdgpu.bapm=1
NW amdgpu.bapm=0
NW amdgpu.ppfeaturemask=0xfff7bff7
NW amdgpu.ppfeaturemask=0xfff7bdff
NW amdgpu.ppfeaturemask=0xfff7bbff
NW amdgpu.ppfeaturemask=0xfff73fff
NW amdgpu.ppfeaturemask=0xfff3bfff
NW amdgpu.exp_hw_support=1
NW amdgpu.exp_hw_support=0
NW amdgpu.forcelongtraining=0
NW amdgpu.forcelongtraining=1
NW amdgpu.cg_mask=0x00000000
NW amdgpu.cg_mask=0xffffffff
NW amdgpu.pg_mask=0xffffffff
NW amdgpu.ngg=1
NW amdgpu.ngg=0
NW amdgpu.job_hang_limit=1000
NW amdgpu.job_hang_limit=100
NW amdgpu.lbpw=1
NW amdgpu.lbpw=0
NW amdgpu.gpu_recovery=1
NW amdgpu.gpu_recovery=0
NW amdgpu.sched_policy=2
NW amdgpu.sched_policy=1
NW amdgpu.sched_policy=0
NW amdgpu.ignore_crat=0
NW amdgpu.ignore_crat=1
NW amdgpu.ras_enable=0
NW amdgpu.ras_enable=1
NW amdgpu.async_gfx_ring=0
NW amdgpu.async_gfx_ring=1
NW amdgpu.mcbp=1
NW amdgpu.mcbp=0
NW amdgpu.mes=0
NW amdgpu.mes_kiq=1
NW amdgpu.mes_kiq=0
NW amdgpu.reset_method=0
NW amdgpu.reset_method=1
NW amdgpu.reset_method=2
NW amdgpu.reset_method=3
NW amdgpu.reset_method=4
NW amdgpu.reset_method=-1
NW idle=nomwait
NB amdgpu.pg_mask=0x00000000
NB amdgpu.mes=1

I've developed a script and a GRUB2 menu entry for live Kubuntu that triggers the freeze and saves the dmesg output into a file called Freeze_Dell_G5_SE_5505.sh.log at the root of the drive it is booted from. Replace the value of the ISO variable with the path to your ISO file if it is not in the root directory of the drive and/or if it is a different version:

menuentry "Start Kubuntu 22.04.1 (64 bit) without Ubiquity and with a freezing script" {
ISO=/kubuntu-22.04.1-desktop-amd64.iso
set gfxpayload=keep
loopback loop "$ISO"
probe -u $root --set=rootid
linux (loop)/casper/vmlinuz iso-scan/filename="$ISO" file=/cdrom/preseed/kubuntu.seed maybe-ubiquity quiet splash init=/bin/sh -- -c 'for script in /home/kubuntu/Desktop/Freeze_Dell_G5_SE_5505.sh ; do for autorun in /home/kubuntu/.config/autostart/${script##*/} ; do ln -fs /dev/null /etc/systemd/system/graphical.target.wants/ubiquity.service ; mkdir -p ${script%/*} ${autorun%/*} ; printf
\043!_/bin/sh++print\050\051_{+\tprintf_"@1"_,_seq_-s"_"_@\050\050_@\050stty_size_\074_@t_?_sed_"s/^/\050/,_s/_/_-_1_\051_*_/"\051_-_@{\0431}_\051\051_?_sed_s/[0-9]//g+}+t\075"@\050readlink_/proc/self/fd/0\051"++d\075"@\050env_LANG\075C_udisksctl_mount_-b_/dev/disk/by-uuid/$0_-o_sync_2\076_/dev/null_?_sed_"s/^Mounted_.*_at_//g,_s/\\.@//g"\051"+[_-d_"@d"_]_\046\046_f\075oflag\075direct_??_d\075"@{0%%/*}"+sudo_dmesg_-w_?_sudo_dd_of\075"@d/@{0\043\043*/}.log"_@f_\046+i\0750+seq_28_150000_?_while_read_N_,_do+\tprint_@N+\ttimeout_3_env_DISPLAY\075:0_plasma-systemmonitor_\076_/dev/null_2\076\0461+\tn\075@N_,_while_[_0_-lt_@n_]_,_do+\t\tsleep_1+\t\tn\075@\050\050_@n_-_1_\051\051+\t\ti\075@\050\050_@i_^_1_\051\051+\t\t[_"@i"_\075_1_]_\046\046_printf_"\\33[30m\\33[47m"_??_printf_"\\33[37m\\33[40m"+\t\tprint_@n+\tdone+done++echo_END!+exit+ | tr _,?@+ \40\73\174\044\n > $script ; printf [Desktop_Entry]\nType=Application\nExec=kstart_--maximize_--_konsole_-e_ | tr _ \40 > ${autorun%.sh}.desktop ; printf $script\n >> ${autorun%.sh}.desktop ; chmod +x $script ${autorun%.sh}.desktop ; chown -R kubuntu:kubuntu /home/kubuntu ; exec /sbin/init maybe-ubiquity splash --- ; done ; done' $rootid
initrd (loop)/casper/initrd
}

The script generated on the live Kubuntu desktop runs KDE's System Monitor for three seconds and then waits before running it again. With each iteration, it waits one second longer than before. A parameter passed the test if, for five boots in a row, the system did not freeze before the script reached a wait of 50 seconds (I'd now recommend 60, as with 50 it sometimes froze after the second boot).

Would someone also tell us which workaround should be used under which performance/latency requirements? (An EXAMPLE, which may be wrong: Users who need the best performance or lowest latency should use pcie_port_pm=off, users who need the best battery life should use amdgpu.msi=0.)
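For anyone who would rather not decode the menu entry, the test loop described above can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the script embedded in the menu entry: the function name stress_loop and the SLEEP_CMD override are hypothetical, and the dmesg logging and the countdown display are left out. The starting wait of 28 seconds is taken from the "seq 28 150000" call in the embedded script.

```shell
#!/bin/sh
# Simplified sketch of the freeze test loop (not the original script).
#
# stress_loop FIRST LAST CMD...
#   For each wait from FIRST to LAST seconds (inclusive), run CMD for at
#   most 3 seconds, then sleep for the current wait.
#
# SLEEP_CMD is a hypothetical override so the loop can be dry-run quickly.
stress_loop() {
    first=$1
    last=$2
    shift 2
    sleeper=${SLEEP_CMD:-sleep}
    wait_s=$first
    while [ "$wait_s" -le "$last" ]; do
        echo "wait $wait_s"
        # timeout(1) stops the command after 3 seconds; its output is discarded.
        timeout 3 "$@" > /dev/null 2>&1
        "$sleeper" "$wait_s"
        wait_s=$((wait_s + 1))
    done
}

# Example invocation (commented out so that sourcing this file runs nothing):
# stress_loop 28 60 env DISPLAY=:0 plasma-systemmonitor
```

With the example invocation, a run that prints "wait 60" without the machine freezing would correspond to one passed boot under the 60-second criterion.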
If you fix the issue, could you please tell the users (not just the developers) what the problem was? (An EXAMPLE, which may be wrong: The driver was waiting for an interrupt, but the bus was down, so the message-signalled interrupt could not arrive and the operation timed out.)

Thanks.

--
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.