Re: System is randomly freezing, would like troubleshooting help

Hi,

On 14 April 2023 04:06:56 CEST, Luna Celeste <luna@xxxxxxxxxxxx> wrote:
>On Wed, Apr 12, 2023 at 11:02:46 +0200, Shawn Michaels wrote:
>> On 11 April 2023 17:49:33 CEST, Luna Celeste <luna@xxxxxxxxxxxx> wrote:
>> >>>it's been randomly hanging. 
>> >> 
>> >> Also, things that may help you track this down:
>> >> - monitor /proc/interrupts when it freezes
>> >
>> >This is a 16 core processor and there's too much output on my 27"
>> >display to view it all at once; suggestions?
>> 
>> I would try to run something like this in the background:
>> watch -n 1 "cat /proc/interrupts >> ~/watch.log && sync"
>> 
>> (I did not check that the command works as expected but you get the
>> intention).
>> 
>> Once a crash is caught, analyze the produced logs. Perhaps you can
>> monitor other files from sysfs/debugfs as well.
>
>This is a good strategy, thank you! I'm a little worried about disk
>wear, though, but maybe that's just human bias?

If you're worried about that, you can store the logs on an external USB drive. Alternatively, you could SSH into the box from another machine and use something like "script" there to log the SSH session to a file on that other machine. That way, you wouldn't need to sync every second.
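
A minimal, untested-on-your-setup sketch of the SSH variant. The function name snapshot_loop, its arguments, and the host and file names in the usage note are all made up for illustration. It just prints timestamped copies of a file to stdout, so whatever machine captures that output holds the log:

```shell
# Print timestamped snapshots of a file to stdout, one per interval.
# Arguments (all hypothetical defaults): file, interval in seconds,
# number of snapshots (0 means run until interrupted).
snapshot_loop() {
    file=${1:-/proc/interrupts}
    interval=${2:-1}
    count=${3:-0}
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        # Header line so individual snapshots can be told apart in the log.
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        cat "$file"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

If this were saved as snapshot.sh with a final line calling snapshot_loop, something like `ssh user@box 'sh -s' < snapshot.sh >> interrupts.log` on the other machine should keep the log off the affected disk entirely (user@box is a placeholder). Wrapping an interactive session as `script -c "ssh user@box" session.log` and running the watch command inside it would be the closer match to what I described above.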

>> Another thing that comes to mind: perhaps your system is still
>> running, albeit very slow. I see that you're running libvirt. I've had
>> a problem like this on my host: for more than a year, it would
>> randomly and seldomly "freeze" (become astonishingly slow) when
>> starting a VM (Windows guest with multiple passthroughs). I tried to
>> debug this by increasing journald/kernel log levels but the issue
>> appears to have vanished lately. I just assumed that it was fixed
>> upstream, but perhaps it's still there.
>
>Most of the time the VMs aren't actually running when the machine
>freezes / hangs; also, the last time it froze, the display was still
>active, and the clock hadn't advanced for something like 6-10 hours,
>matching the time when the mosh session lost its connection. So I don't
>think this is the cause.

This may still be the case. If the system only gets e.g. a couple of ticks every minute, the clock on the display may not advance by a single minute for a very long time.
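
One way to tell the two cases apart, sketched under the assumption that a heartbeat such as `while sleep 1; do date +%s; done >> heartbeat.log` was running before the hang (ideally captured over SSH). The helper name find_gaps and the 5 second default threshold are made up for illustration. It lists every pause between consecutive epoch timestamps that is longer than the threshold:

```shell
# Report gaps between consecutive epoch timestamps (one per line) in a log.
# Arguments: log file, maximum allowed gap in seconds (hypothetical default 5).
# Output: one line per gap, as "previous_timestamp next_timestamp gap_seconds".
find_gaps() {
    log=$1
    max=${2:-5}
    awk -v max="$max" '
        NR > 1 && $1 - prev > max { print prev, $1, $1 - prev }
        { prev = $1 }
    ' "$log"
}
```

If the log simply stops at the time of the hang, the machine was really dead. If entries keep arriving but with large gaps, it was still running, only extremely slowly.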

>Unrelated, would you please check your mail client? When you reply, I
>get a copy in my main inbox and in the folder for the mailing list,
>despite setting both Mail-Followup-To and Reply-To headers. Something
>seems to be acting strangely. 

Sorry about that. I'm not used to mailing lists. I had a quick look through the settings and couldn't find anything related. Maybe it happened because I replied to an "old" mail from the middle of the thread? I'm using K-9 Mail on Android. If somebody has an idea, don't hesitate to chime in.




