Hi Bart,
On 2023/06/12 20:40, Bart Van Assche wrote:
> On 6/9/23 00:29, Jaco Kroon wrote:
>> I'm attaching dmesg -T and ps axf. dmesg in particular may provide
>> clues as it provides a number of stack traces indicating stalling at
>> IO time.
>>
>> Once this has triggered, even commands such as "lvs" go into
>> uninterruptible wait. I unfortunately didn't test "dmsetup ls" and
>> have already triggered a reboot (system needs to be up).
>
> To me the call traces suggest that an I/O request got stuck.
> Unfortunately call traces are not sufficient to identify the root
> cause in case I/O gets stuck. Has debugfs been mounted? If so, how
> about dumping the contents of /sys/kernel/debug/block/ into a tar file
> after the lockup has been reproduced and sharing that information?

Looks to be mounted, at least I've got a /sys/kernel/debug/block/ folder
on the relevant server.

    tar -czf- -C /sys/kernel/debug/block . >block.tgz

Definitely can do. I'm not sure how to interpret the data in this,
though: is there anything specific you're looking for? I would love to
not just pass the information on but also learn from this.
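
One caveat I'm not certain about: files under debugfs generally report a
size of 0, and GNU tar may trust that size and store them empty. As a
fallback, here is a minimal sketch that reads each file with cat into a
scratch directory first. snapshot_tree and the paths in the usage line
below are my own placeholders, not an existing tool:

```shell
#!/bin/sh
# snapshot_tree SRC DEST  (hypothetical helper name)
# Copy every regular file under SRC into DEST by reading its contents
# with cat, rather than relying on the size reported by stat.
snapshot_tree() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Make DEST absolute so it stays valid after the cd below.
    dest=$(cd "$dest" && pwd) || return 1
    (
        cd "$src" || exit 1
        # Recreate the directory layout first.
        find . -type d | while IFS= read -r d; do
            mkdir -p "$dest/$d"
        done
        # Read each file; unreadable entries are skipped, not fatal.
        find . -type f | while IFS= read -r f; do
            cat "$f" > "$dest/$f" 2>/dev/null || true
        done
    )
}
```

Usage would be along the lines of
"snapshot_tree /sys/kernel/debug/block /tmp/block-debug" as root once the
lockup has triggered, followed by
"tar -czf block.tgz -C /tmp/block-debug ." to package the copy.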

Generally the lockup rate seems to be about once a week currently, so I
expect (on average) to see this pop up again some time over the weekend.

Kind regards,
Jaco