On Mon, 2023-06-12 at 11:40 -0700, Bart Van Assche wrote:
> On 6/9/23 00:29, Jaco Kroon wrote:
> > I'm attaching dmesg -T and ps axf. dmesg in particular may provide
> > clues as it provides a number of stack traces indicating stalling
> > at IO time.
> >
> > Once this has triggered, even commands such as "lvs" goes into
> > uninterruptable wait, I unfortunately didn't test "dmsetup ls" now
> > and triggered a reboot already (system needs to be up).
>
> To me the call traces suggest that an I/O request got stuck.
> Unfortunately call traces are not sufficient to identify the root
> cause in case I/O gets stuck. Has debugfs been mounted? If so, how
> about dumping the contents of /sys/kernel/debug/block/ into a tar
> file after the lockup has been reproduced and sharing that
> information?
>
> tar -czf- -C /sys/kernel/debug/block . >block.tgz
>
> Thanks,
>
> Bart.

One I am aware of is this commit:

commit 106397376c0369fcc01c58dd189ff925a2724a57
Author: David Jeffery <djeffery@xxxxxxxxxx>

Can we try to get a vmcore (assuming it's not a secure site)?

Add these to /etc/sysctl.conf:

kernel.panic_on_io_nmi = 1
kernel.panic_on_unrecovered_nmi = 1
kernel.unknown_nmi_panic = 1

Run sysctl -p.
Ensure kdump is running and can capture a vmcore.
When it locks up again, send an NMI via the SuperMicro Web Management
interface.
Share the vmcore, or we can have you capture some specifics from it to
triage.

Thanks
Laurence
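
P.S. Before waiting for the next lockup, it may be worth confirming that the three NMI settings above really are in the config file. Below is a minimal sketch, not a tested tool: the function name check_nmi_sysctls is made up for illustration, and it assumes the settings live in a single sysctl-style file (/etc/sysctl.conf by default) rather than in a drop-in directory.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative only): report which of the
# three NMI panic settings are set to 1 in a sysctl-style config file.
# Returns 0 if all three are present, 1 otherwise.
check_nmi_sysctls() {
    conf="${1:-/etc/sysctl.conf}"
    missing=0
    for key in kernel.panic_on_io_nmi \
               kernel.panic_on_unrecovered_nmi \
               kernel.unknown_nmi_panic; do
        # Escape the dots so they match literally in the regex.
        pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
        # Accept "key = 1" or "key=1", ignoring surrounding whitespace.
        if grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=[[:space:]]*1[[:space:]]*$" "$conf" 2>/dev/null; then
            echo "ok: $key"
        else
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_nmi_sysctls with no argument checks /etc/sysctl.conf. This only inspects the file; the values only take effect after sysctl -p, and the live values can be read back from /proc/sys/kernel/.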