Re: Filesystem corruption after unreachable storage

On Tue, Feb 25, 2020 at 02:19:09PM +0100, Jean-Louis Dupond wrote:
> FYI,
> 
> Just did same test with e2fsprogs 1.45.5 (from buster backports) and kernel
> 5.4.13-1~bpo10+1.
> And having exactly the same issue.
> The VM needs a manual fsck after storage outage.
> 
> Don't know if it's useful to test with 5.5 or 5.6?
> But it seems like the issue still exists.

This is going to be a long shot, but if you could try testing with
5.6-rc3, or with this commit cherry-picked into a 5.4 or later kernel:

   commit 8eedabfd66b68a4623beec0789eac54b8c9d0fb6
   Author: wangyan <wangyan122@xxxxxxxxxx>
   Date:   Thu Feb 20 21:46:14 2020 +0800

       jbd2: fix ocfs2 corrupt when clearing block group bits
       
       I found a NULL pointer dereference in ocfs2_block_group_clear_bits().
       The running environment:
               kernel version: 4.19
               A cluster with two nodes, 5 luns mounted on two nodes, and do some
               file operations like dd/fallocate/truncate/rm on every lun with storage
               network disconnection.
       
       The fallocate operation on dm-23-45 caused an null pointer dereference.
       ...

... it would be interesting to see if it fixes things for you.  I can't
guarantee that it will, but the trigger of the failure which wangyan
found is very similar indeed.

Thanks,

						- Ted


