Re: Filesystem corruption after unreachable storage

On 20/02/2020 16:50, Theodore Y. Ts'o wrote:
On Thu, Feb 20, 2020 at 10:08:44AM +0100, Jean-Louis Dupond wrote:
dumpe2fs -> see attachment
Looking at the dumpe2fs output, it's interesting that it was "clean
with errors", without any error information being logged in the
superblock.  What version of the kernel are you using?  I'm guessing
it's a fairly old one?
Debian 10 (Buster), with kernel 4.19.67-2+deb10u1
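
For reference, the "clean with errors" state discussed above comes from the "Filesystem state:" line of the superblock summary printed by dumpe2fs -h. A minimal sketch for pulling out just that field follows; the helper name fs_state is hypothetical, and the device path in the usage line is simply the one from this thread.

```shell
#!/bin/sh
# Sketch: extract the "Filesystem state:" value from dumpe2fs -h output.
# fs_state is a hypothetical helper name; it reads the summary on stdin
# and prints only the state string (for example "clean with errors").
fs_state() {
    sed -n 's/^Filesystem state:[[:space:]]*//p'
}

# Typical use (needs read access to the device, usually root):
#   dumpe2fs -h /dev/mapper/vg01-root 2>/dev/null | fs_state
```

The helper only filters text, so it can also be tried on a saved copy of the dumpe2fs output.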
Fsck:
# e2fsck -fy /dev/mapper/vg01-root
e2fsck 1.44.5 (15-Dec-2018)
And that's an old version of e2fsck as well.  Is this some kind of
stable/enterprise linux distro?
Debian 10 indeed.
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found.  Fix? yes

Inode 165708 was part of the orphaned inode list.  FIXED.
OK, this and the rest looks like it's relating to a file truncation or
deletion at the time of the disconnection.

On KVM, for example, there is an unlimited timeout (afaik) until the
storage is back, and the VM just continues running after storage recovery.
Well, you can adjust the SCSI timeout, if you want to give that a try....
Does it have any other disadvantages? Or is it quite safe to increase the
SCSI timeout?
It should be pretty safe.
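
As a rough illustration of the adjustment being discussed: on Linux the per-device SCSI command timeout is exposed in sysfs as /sys/block/<dev>/device/timeout, in seconds. The sketch below reads and writes that file. The function names, the SYSFS_ROOT override (there only so the functions can be tried against a scratch directory) and the example device and value (sda, 180) are assumptions made for illustration, not something taken from this thread.

```shell
#!/bin/sh
# Sketch: read or set the SCSI command timeout (seconds) for one disk.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.

scsi_timeout_path() {
    # $1 = block device name, e.g. sda
    printf '%s/block/%s/device/timeout\n' "${SYSFS_ROOT:-/sys}" "$1"
}

get_scsi_timeout() {
    cat "$(scsi_timeout_path "$1")"
}

set_scsi_timeout() {
    # $1 = block device name, $2 = timeout in seconds (positive integer)
    case "$2" in
        ''|*[!0-9]*)
            echo "timeout must be a positive integer" >&2
            return 1
            ;;
    esac
    if [ "$2" -le 0 ]; then
        echo "timeout must be greater than zero" >&2
        return 1
    fi
    printf '%s\n' "$2" > "$(scsi_timeout_path "$1")"
}

# Example, as root (device name and value are illustrative only):
#   get_scsi_timeout sda
#   set_scsi_timeout sda 180
```

A value written this way does not survive a reboot or a device re-scan, so a permanent change would need something like a udev rule or a boot-time script.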

Can you reliably reproduce the problem by disconnecting the machine
from the SAN?
Yep, it can be reproduced by killing the connection to the SAN while the VM is running and then, after the SCSI timeout has passed, re-enabling the SAN connection. After that, reset the machine, and you need to run an fsck to get it back online.
						- Ted


