PATCH [0/4 V2] xfs: log recovery hang fixes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi folks,

Willy reported generic/530 had started hanging on his test machines
and I've tried to reproduce the problem he reported. While I haven't
reproduced the exact hang he's been having, I've found a couple of
others while running g/530 in a tight loop on a couple of test
machines.

The first 3 patches are defensive fixes - the log worker acts as a
watchdog, and the issues in patch 2 and 3 were triggered on my
testing of g/530 and lead to 30s delays that the log worker watchdog
caught. Without the watchdog, these may actually be deadlock
triggers.

The 4th patch is the one that fixes the problem Willy reported.
It is a regression from conversion of the AIL pushing to use
non-blocking CIL flushes. It is unknown why this suddenly started
showing up on Willy's test machine right now, and why only on that
machine, but it is clearly a problem. This patch catches the state
that leads to the deadlock and breaks it with an immediate log
force to flush any pending iclogs.

Version 2:
- updated to 5.17-rc7
- tested by Willy.

Version 1:
- https://lore.kernel.org/linux-xfs/20220307053252.2534616-1-david@xxxxxxxxxxxxx/




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux