On Mon, Oct 05, 2020 at 11:46:01AM +0200, Jan Kara wrote: > On Mon 05-10-20 01:14:54, Josh Triplett wrote: > > Ran into an ext4 regression when testing upgrades to 5.9-rc kernels: > > > > Commit e7bfb5c9bb3d ("ext4: handle add_system_zone() failure in > > ext4_setup_system_zone()") breaks mounting of read-only ext4 filesystems > > with intentionally overlapping bitmap blocks. > > > > On an always-read-only filesystem explicitly marked with > > EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS, prior to that commit, it's safe to > > point all the block and inode bitmaps to a single block of all 1s, > > because a read-only filesystem will never allocate or free any blocks or > > inodes. > > However, after that commit, the block validity check rejects such > > filesystems with -EUCLEAN and "failed to initialize system zone (-117)". > > This causes systems that previously worked correctly to fail when > > upgrading to v5.9-rc2 or later. > > > > This was obviously a bugfix, and I'm not suggesting that it should be > > reverted; it looks like this effectively worked by accident before, > > because the block_validity check wasn't fully functional. However, this > > does break real systems, and I'd like to get some kind of regression fix > > in before 5.9 final if possible. I think it would suffice to make > > block_validity default to false if and only if > > EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS is set. > > > > Does that seem like a reasonable fix? > > Well, but EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS is your internal feature > that's not present in current upstream kernel AFAICS. It isn't "my" feature; the value for EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS is defined in the headers in the e2fsprogs tree. The kernel currently does absolutely nothing with it, nor did it previously need to; it's just an RO_COMPAT feature which ensures that the kernel can only mount the filesystem read-only. The point is that an always-read-only filesystem will never change the block or inode bitmaps, so ensuring they don't overlap is unnecessary (and harmful). I only added EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS to the header to generate the corresponding ext4_has_feature_shared_blocks function. I have filesystems that previous kernels mounted and worked with just fine, and new kernels reject. That seems like a regression to me. I'm suggesting the simplest possible way I can see to fix that regression. Another approach would be to default block_validity to false for any read-only filesystem mount (since it won't be written to), but that seemed like it'd be more invasive; I was going for a more minimal change.