On Sat, Apr 8, 2023 at 8:56 PM Luna Celeste <luna@xxxxxxxxxxxx> wrote:
>
> I have a System76 Thelio machine running Arch (fully up to date as of a
> few hours ago today). For quite a while now (at least a few months),
> it's been randomly hanging. I've configured the system to provide a
> crash dump per the Kdump wiki page, but whenever I reboot it after one
> of these hangs (or use the sysrq crash key), there is no /proc/vmcore
> file, and there's nothing in the logs. As you can understand, this makes
> troubleshooting difficult.
>
> In addition to this, I've not been able to identify many common factors
> of the hangs. The only thing that seems to be consistent is that most of
> the hangs occur when the system puts the display to sleep. But even this
> isn't consistent; today's hang occurred while the display was active
> (though I turned off the monitor).

Hi Luna,

I've been having what appears to be the same issue, but on a Dell
Inspiron 7572. When it happens, I also see a blinking Caps Lock
(indicating a kernel panic) just a few seconds after suspending or
turning off the screen. Does this detail match as well?

In my case, not using the coretemp driver seemed to help for a while,
and there was a patch on the HWMON list that I was hoping would solve
the issue for good.[1] Unfortunately I just saw that the patch has been
in the Arch kernel for a few versions already, and yesterday I started
having these issues again (after ~5 months).

But it also turns out that I never actually blocklisted coretemp back in
November, and in the past few days I've been leaving btop open, which
also reads from coretemp (if available). So there's still some
correlation... and for the lack of a better idea, I've now blocklisted
coretemp, and will see if that makes a difference over the next few
days/weeks. Maybe you can try doing the same, if you're also on Intel.

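
For anyone trying the same thing: a blocklist entry is usually a
`blacklist coretemp` line in a file under /etc/modprobe.d/ (the file
name is your choice). Below is a minimal sketch, not taken from this
thread, of one way to confirm after a reboot that the module really
stayed unloaded. It assumes a Linux system where /proc/modules is
readable; the function names are hypothetical helpers made up for
illustration.

```python
def is_module_loaded(name: str, proc_modules_text: str) -> bool:
    """Return True if `name` is listed as a module in /proc/modules-style text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field on each line is the module name.
        if fields and fields[0] == name:
            return True
    return False


def read_proc_modules(path: str = "/proc/modules") -> str:
    """Read the kernel's list of currently loaded modules (Linux only)."""
    with open(path, encoding="ascii", errors="replace") as f:
        return f.read()


if __name__ == "__main__":
    loaded = is_module_loaded("coretemp", read_proc_modules())
    print("coretemp is loaded" if loaded else "coretemp is not loaded")
```

Note that this only sees loadable modules. A driver compiled directly
into the kernel does not appear in /proc/modules, so "not loaded" here
would not prove the driver is inactive in that case.
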
[1]: https://lore.kernel.org/linux-hwmon/20230103114620.15319-1-janusz.krzysztofik@xxxxxxxxxxxxxxx/

Cheers,
Jonas

>
> I've changed hardware and operating systems a few times. I originally
> ran PopOS on this machine, and experienced hangs there with an NVIDIA
> graphics card, but I moved to AMD when installing Manjaro. Again, to be
> clear, the machine does have a very up to date Arch install presently.
> The hangs occur with both X11 and Wayland. Sometimes it happens a few
> hours after I boot the machine, sometimes (like today's) it took a week
> or two.
>
> About the only unusual thing about the system is that I'm running ZFS.
> The root filesystem for this installation runs on ZFS, but this wasn't
> the case when it ran PopOS. I still had a ZFS pool on the PopOS machine.
> I have another ZFS pool with my home directory and my libvirt VMs on it.
> That said, neither of the pools have any errors, and I scrub them
> regularly.
>
> As I said at the start of this message, I've been struggling to figure
> out the problem for a few months and I'm not sure how to make progress
> troubleshooting it. I don't see anything in dmesg about overheating, or
> hardware failures, or anything like that. The machine works great...
> until it very suddenly doesn't.
>
> So how do I go about figuring out what's going on? I'd love to be able
> to
>
> --
> Cheers,
> Luna Celeste