Re: new block group bitmaps not being initialized after resize!?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2013-01-10, at 6:07 PM, Eric Sandeen wrote:
> On 1/10/13 6:07 PM, Carlos Maiolino wrote:
>> I'm working on a Fedora bugzilla (https://bugzilla.redhat.com/show_bug.cgi?id=852833)
>> together with Eric and we found that the problem described on the
>> bugzilla happens when the commit 93f9052643 is not applied, which
>> is the case of the Fedora 16 kernel being discussed there.
> 
> Also, to be clear, this is with older e2fsprogs which was using the
> old resize interface.  Not sure what the behavior is w/ newer
> e2fsprogs, but we don't see the corruption.
> 
> Note, we see the corruption on these older kernels even when resizing from say 100G to 120G.  It appears fixed upstream, but so much has
> changed, we need to be sure the older interface doesn't have bugs
> lurking, I think.

The original resize code didn't ever know about uninit_bg, so it
would always zero out the inode table, so I suspect that this was
added at some later point. 

>> Although we already found the solution to the problem in the commit
>> above, looking through the commit have raised some questions
>> regarding to the bitmap of the newer block groups added to the FS
>> after it is extended.
>> 
>> The newer block groups do not have flags ITABLE_ZEROED and
>> INODE_UNINIT set, even when 'lazy_itable_init' is enabled.
> 
> In particular, we see things like this in the last pre-resize group:
> 
> Group 799: (Blocks 26181632-26214399) [INODE_UNINIT, ITABLE_ZEROED]
>  Checksum 0xafe7, unused inodes 1936
>  Block bitmap at 25165855 (bg #768 + 31), Inode bitmap at 25166111 (bg #768 + 287)
>  Inode table at 25170087-25170207 (bg #768 + 4263)
>  32768 free blocks, 1936 free inodes, 0 directories, 1936 unused inodes
>  Free blocks: 26181632-26214399
>  Free inodes: 1546865-1548800
> 
> and this in the newly-added groups:
> 
> Group 800: (Blocks 26214400-26247167)
>  Checksum 0xddc4, unused inodes 0
>  Block bitmap at 26236224 (+21824), Inode bitmap at 26236225 (+21825)
>  Inode table at 26214400-26214520
>  32645 free blocks, 1936 free inodes, 0 directories
>  Free blocks: 26214521-26236223, 26236226-26247167
>  Free inodes: 1548801-1550736
> 
> so it says 0 unused inodes, but also 1936 free inodes (?), and
> no UNINIT or ZEROED flags set.  e2fsck finds stale data in the inode table, and goes nuts.

Zeroing the inode table but not setting the INODE_ZEROED flag
would not be harmful, but this seems to not be the case.

When the filesystem is remounted, does the kernel lazyinit
thread zero out the new groups in the inode table?


>> Without this commit, inode stale data might be exposed and also makes fsck complain about all inodes of the newer block groups.
> 
> *nod* :)

That's why 

> so 93f9052643 seems to have accidentally fixed this, by setting
> the unused counter to EXT4_INODES_PER_GROUP(), but it feels like
> we've missed properly setting up this block group.

Actually, just setting the unused counter is not enough to properly
fix this problem.  The lazyinit thread should be started to do
background zeroing of the inode table, otherwise if the group
descriptor is corrupted and the bg_itable_unused value is wrong,
then the uninitialized inodes would be accessed.

> To be honest, though, sometimes I get lost in the sea of flags.
> 
>> The question is, are these flags expected to not be found on
>> these newer block groups or they should be set?
> 
> *nod* :)

Depends on how it is implemented. :-/  The flag should definitely
not be set unless the itable is actually overwritten with zeroes.
It makes sense that the lazyinit thread would do this in the
background while the filesystem is mounted instead of waiting for
the next time that the filesystem is mounted.

Looking at the code, it appears that this is not happening at the
end of the resize, since ext4_register_li_request() is marked
static for the superblock.  It looks like it would be relatively
straight forward to add a call to ext4_register_li_request() to
ext4_resize_end() with the first group added to the resize.

It looks like ext4_run_li_request() will skip groups that are
already marked as INODE_ZERO, so it is fine to always call it even
if the kernel is already setting this itself in some cases (not
that I see this happening).

>> The lack of these flags on newer block groups is an expected
>> behaviour or is something that should be fixed?
>> 
>> FWIW, in the old ext4_group_add(), we added EXT4_BG_INODE_ZEROED
>> flag to the bg_flags, also, I did some tests here and the lack
>> of these flags look to not be affecting filesystem integrity,
>> i.e. new inodes can be properly allocated, which sounds that
>> these uninitialized inodes/bitmaps are set to be initialized
>> as soon as a first inode is allocated on these new BGs.

Well, the old code always zeroed the inode table, and it made sense
to mark it as such.

Cheers, Andreas





--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux