Hi,

DISCLAIMER: My knowledge of ext4 internals is limited. What I state here is based on what I've deduced from the code plus some intuition, so some (or even all) of it may be completely incorrect.

Checking against kernel version 4.16, it looks like the ext4_update_super() function in the kernel is responsible for updating the superblock. This code is potentially problematic:

    le32_add_cpu(&es->s_inodes_count, EXT4_INODES_PER_GROUP(sb) *
                 flex_gd->count);

From what I can tell there is absolutely no overflow or maximum protection here. The same calculation is done for s_free_inodes_count.

The value of inodes per group is 512 (2^9). The only two paths to ext4_flex_group_add() (the only user of ext4_update_super()) are ext4_group_add(), which always has flex_gd->count == 1, and ext4_resize_fs(), which calculates flex_gd->count based on s_log_groups_per_flex (0, so flex_gd->count would still be 1). So when we reach 8388608 (2^23) flex_gd's we end up with 2^32 inodes, which wraps to 0 in the 32-bit field, and we've got a corrupt filesystem.

This also implies that the s_inodes_count value should be a multiple of inodes per group, if I'm not mistaken.

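To make the wraparound concrete, here is a minimal Python sketch of the arithmetic described above. It is not kernel code: add_u32() and would_overflow_u32() are hypothetical helpers that only model the 32-bit wrapping behaviour of an add on s_inodes_count, and the figures (512 inodes per group, one group added at a time) are the ones assumed in this message, not values read from a real filesystem.

```python
U32_MASK = 0xFFFFFFFF  # s_inodes_count is a 32-bit on-disk field


def add_u32(value: int, delta: int) -> int:
    """Hypothetical stand-in for le32_add_cpu(): add with 32-bit wraparound."""
    return (value + delta) & U32_MASK


def would_overflow_u32(value: int, delta: int) -> bool:
    """The kind of guard that appears to be missing: detect the wrap first."""
    return value + delta > U32_MASK


INODES_PER_GROUP = 512    # 2^9, as assumed in this message
GROUPS_TO_WRAP = 1 << 23  # 8388608 groups

# Inode count after all but the last group has been added.
count = INODES_PER_GROUP * (GROUPS_TO_WRAP - 1)
print(count)                                        # 4294966784
print(would_overflow_u32(count, INODES_PER_GROUP))  # True
print(add_u32(count, INODES_PER_GROUP))             # 0
```

Adding the final group takes the counter from 2^32 - 512 to exactly 2^32, which is stored as 0. A check along the lines of would_overflow_u32() before the add is one possible way to refuse such a resize, though I have not verified how the kernel would best do this.
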
And finally had the savvy to check the kernel logs, and this may shed some light on the issues:

May 21 12:43:59 crowsnest kernel: EXT4-fs (dm-5): resizing filesystem from 16508780544 to 17179869184 blocks
May 21 12:44:09 crowsnest kernel: EXT4-fs (dm-5): resized to 16541286400 blocks
May 21 12:44:19 crowsnest kernel: EXT4-fs (dm-5): resized to 16576937984 blocks
May 21 12:44:29 crowsnest kernel: EXT4-fs (dm-5): resized to 16611540992 blocks
May 21 12:44:39 crowsnest kernel: EXT4-fs (dm-5): resized to 16649289728 blocks
May 21 12:44:49 crowsnest kernel: EXT4-fs (dm-5): resized to 16687038464 blocks
May 21 12:44:59 crowsnest kernel: EXT4-fs (dm-5): resized to 16725311488 blocks
May 21 12:45:09 crowsnest kernel: EXT4-fs (dm-5): resized to 16763584512 blocks
May 21 12:45:19 crowsnest kernel: EXT4-fs (dm-5): resized to 16797138944 blocks
May 21 12:45:29 crowsnest kernel: EXT4-fs (dm-5): resized to 16839081984 blocks
May 21 12:45:39 crowsnest kernel: EXT4-fs (dm-5): resized to 16876830720 blocks
May 21 12:45:50 crowsnest kernel: EXT4-fs (dm-5): resized to 16917200896 blocks
May 21 12:46:00 crowsnest kernel: EXT4-fs (dm-5): resized to 16954425344 blocks
May 21 12:46:10 crowsnest kernel: EXT4-fs (dm-5): resized to 16989552640 blocks
May 21 12:46:20 crowsnest kernel: EXT4-fs (dm-5): resized to 17027825664 blocks
May 21 12:46:30 crowsnest kernel: EXT4-fs (dm-5): resized to 17065574400 blocks
May 21 12:46:40 crowsnest kernel: EXT4-fs (dm-5): resized to 17103847424 blocks
May 21 12:46:50 crowsnest kernel: EXT4-fs (dm-5): resized to 17143169024 blocks
May 21 12:47:00 crowsnest kernel: EXT4-fs error (device dm-5): ext4_search_dir:1296: inode #304881794: block 1219511409: comm rsync: bad entry in directory: inode out of bounds - offset=860(860), inode=1455559466, rec_len=44, name_len=36
May 21 12:47:00 crowsnest kernel: EXT4-fs error (device dm-5): htree_dirblock_to_tree:1006: inode #514662607: block 2058395547: comm du: bad entry in directory: inode out of bounds - offset=0(0), inode=514662607, rec_len=12, name_len=1
May 21 12:47:00 crowsnest kernel: EXT4-fs error (device dm-5) in ext4_reserve_inode_write:5759: Corrupt filesystem
May 21 12:47:00 crowsnest kernel: EXT4-fs error (device dm-5) in ext4_reserve_inode_write:5759: Corrupt filesystem
May 21 12:47:00 crowsnest kernel: EXT4-fs error (device dm-5) in ext4_reserve_inode_write:5759: Corrupt filesystem
May 21 12:47:00 crowsnest kernel: EXT4-fs error (device dm-5) in ext4_reserve_inode_write:5759: Corrupt filesystem
May 21 12:47:00 crowsnest kernel: EXT4-fs error (device dm-5) in ext4_reserve_inode_write:5759: Corrupt filesystem
May 21 12:47:00 crowsnest kernel: EXT4-fs error (device dm-5): htree_dirblock_to_tree:1006: inode #814645264: block 3258462095: comm du: bad entry in directory: inode out of bounds - offset=0(0), inode=814645264, rec_len=12, name_len=1
May 21 12:47:00 crowsnest kernel: EXT4-fs error (device dm-5) in ext4_reserve_inode_write:5759: Corrupt filesystem
May 21 12:47:00 crowsnest kernel: EXT4-fs error (device dm-5) in ext4_reserve_inode_write:5759: Corrupt filesystem
May 21 12:47:00 crowsnest kernel: EXT4-fs (dm-5): resized filesystem to 17179869184 blocks

The last post-resize output from the kernel is 2^34. From there on the number of errors just keeps going on inode out of bounds, obviously since number of inodes is zero any check of the form "inode_num <= le32_to_cpu(sb->s_inodes_count)" would fail...

So it really looks like this is something to do with the fact that we sized to 64TB.

Kind Regards,
Jaco

On 23/05/2018 15:16, Jaco Kroon wrote:
> Hi Jan,
>
> On 23/05/2018 13:37, Jan Kara wrote:
>> Hi,
>> OK, so the Inode count is obviously wrong and the remaining errors are due
>> to that. Apparently the resize process has overflown the inode count to 0
>> (which is not that surprising since the number of inodes in your filesystem
>> would be 1<<32) - that needs fixing but let's first get your fs up and
>> running.
>> I'm actually surprised that e2fsck did anything with the
>> filesystem because for me both 1.44.2 and 1.42.11 versions just exit after
>> printing the error about the corrupted superblock. Anyway what *could* fix
>> your problem is:
>>
>> debugfs -w -R 'ssv inodes_count 4294967295' /dev/lvm/home
>>
>> and then check with dumpe2fs that inode count indeed got fixed. Hope it
>> helps.
> I started to investigate the superblocks as well. Using hexdump and dd
> ... scary. Came to the same conclusion, tried to fix it by replacing it
> in the superblock using dd but that caused other issues so reverted it
> back to all zero.
>
> Also tried with debugfs but could not figure out how to use it so the
> above helped a lot thank you so much! Unfortunately it doesn't help:
>
> crowsnest ~ # dumpe2fs /dev/lvm/home
> dumpe2fs 1.44.2 (14-May-2018)
> dumpe2fs: The ext2 superblock is corrupt while trying to open /dev/lvm/home
> Couldn't find valid filesystem superblock.
>
> fsck and debugfs also now fail, managed to revert that using:
>
> crowsnest ~ # echo -ne "\x00\x00\x00\x00" | dd of=/dev/lvm/home bs=4 count=1 seek=256 conv=notrunc
> 1+0 records in
> 1+0 records out
> 4 bytes copied, 0.0213468 s, 0.2 kB/s
>
> And now we're back to where we started. So I'm contemplating if 2^32-1
> is not perhaps an explicitly invalid value, but I've tried 2^32-2
> (4294967294) as well, same result.
>
> Busy trying to check the e2fsck source files. There are quite a few
> things that can go wrong during ext2fs_open2() and it's unclear what
> exactly is going wrong here. Looks like I may have to modify the code
> to get the error value ...
>
> Since it happened during (directly after?) resize2fs we are actually
> thinking potential kernel bug. Original FS size was 61TB and upsized to
> exactly 64TB. In terms of 4096-byte blocks that's EXACTLY 2^34 blocks, so
> I also aim to look at the kernel sources there, but as you say - first
> we need to get the filesystem up.
>
> Kind Regards,
> Jaco
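
For reference, the dd invocation quoted above writes at byte offset seek * bs = 256 * 4 = 1024, which is where the primary ext2/3/4 superblock starts; s_inodes_count is the first field of that structure, stored as a little-endian 32-bit integer. The Python sketch below works on an in-memory buffer rather than a real device, and the helper names (read_inodes_count, inode_in_bounds) are hypothetical. The bounds test only mirrors the simplified form of check mentioned in this message, not the exact kernel code.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # byte offset of the primary superblock
DD_BS, DD_SEEK = 4, 256   # values from the dd command quoted above


def read_inodes_count(image: bytes) -> int:
    """Read s_inodes_count: little-endian u32, first superblock field."""
    (count,) = struct.unpack_from("<I", image, SUPERBLOCK_OFFSET)
    return count


def inode_in_bounds(inode_num: int, inodes_count: int) -> bool:
    """Simplified form of the check: inode_num <= s_inodes_count."""
    return inode_num <= inodes_count


# Fake 2 KiB "device" whose s_inodes_count has wrapped to zero.
image = bytearray(2048)
struct.pack_into("<I", image, DD_SEEK * DD_BS, 0)

count = read_inodes_count(bytes(image))
print(DD_SEEK * DD_BS == SUPERBLOCK_OFFSET)  # True
print(count)                                 # 0
print(inode_in_bounds(304881794, count))     # False

# 64 TiB expressed in 4096-byte blocks is exactly 2^34 blocks.
print((64 << 40) // 4096 == 1 << 34)         # True
```

Since inode numbers start at 1, a stored count of 0 makes every inode number fail this form of check, which matches the flood of "inode out of bounds" errors in the log.
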