On Tue Mar 5, 2024 at 12:32 AM UTC, Joel Fernandes wrote:
> FWIW, I use a Windows machine that has WSL2 (kernel version
> 5.15.133.1-microsoft-standard-WSL2) and I have never experienced any kind of
> hang. Though, this is a desktop and not a laptop or battery powered device.

Is that also an ARM64 machine? I have never seen this happen on an
x86_64 machine; there it runs like a charm.

Out of curiosity, if you are running an ARM64 desktop, may I ask which
one? The Volterra Development Kit is not available in the Netherlands.

> >
> > It also happens when I build the kernel myself from a more recent
> > release:
> > - https://github.com/maxboone/SQ2-Linux-Kernel-Builds
> >
> > Microsoft should have a Development Kit (Volterra) with identical hardware
> > to mine (and other Surface Pro X, Surface Pro 9 users) that run into the
> > same issue with WSL2.
>
> Right, so at least that's a data point, that its Surface-specific (?). Have you
> tried to disable power management and see if it occurs? Like disable suspend,
> disable cpuidle, etc.

It also happens on non-Surface (but indeed mobile) devices, such as
Lenovo ThinkPads. However, the common denominator might be the Qualcomm
8cx chip (which Microsoft uses as SQ{1,2,3} -> 8cx Gen{1,2,3} with a
beefier GPU).

Changes to power management settings in Windows don't seem to have any
effect, other than that stalls take longer to occur when the device
never sleeps. But the stalls also happen (often) when it doesn't sleep.

Power management in WSL2 seems to be all but unavailable:

```
root@ProX2024:~# uname -r
6.7.7-WSL2-STABLE+
root@ProX2024:~# echo freeze > /sys/power/state
-bash: echo: write error: Function not implemented
root@ProX2024:~# ls /sys/devices/system/cpu/
cpu0  cpu2  cpu4  cpu6  cpufreq   kernel_max  offline  possible  present  vulnerabilities
cpu1  cpu3  cpu5  cpu7  isolated  modalias    online   power     uevent
```

However, it is available in Hyper-V:

```
root@ubuntu0:~# uname -r
6.5.0-21-generic
root@ubuntu0:~# echo freeze > /sys/power/state
root@ubuntu0:~# ls /sys/devices/system/cpu
cpu0  cpu2  cpufreq  hotplug   kernel_max  offline  possible  present  uevent
cpu1  cpu3  cpuidle  isolated  modalias    online   power     smt      vulnerabilities
```

> Have you tried to reproduce the issue with CONFIG_RSEQ=n and see if it happens?

Will build a new kernel today with that flag and report back.

> Also this github thread looks awfully similar to the github thread you pointed
> and has the same clear_rseq signature leading to the RCU stall. Over there also
> it is a hang, but they say the CPU usage is at 100%:
> https://github.com/microsoft/WSL/issues/8529

Indeed, when the RCU stalls occur, the usage of the stalling core ramps
up to 100%. I had thought that was an effect of the stall, but I will
check whether the 100% usage is caused by the process that is stalling.

Cheers,
Max.