Dirent blocks leaking into data file blocks

I'm seeing some corruption in data files during heavy use on ext4 file
systems, which appears to be a bug.  The symptom is this:

A random block in the middle of an otherwise undistinguished 8MB data file
has a pattern like this:

   $ od -Ax -x <file>
   ...
   001000 b4aa 0005 000c 0201 002e 0000 4e31 0005
   001010 0ff4 0202 2e2e 0000 ce67 0004 000c 0102
   001020 6e69 0000 ce69 0004 0fdc 0103 756f 0074
   001030 0000 0000 0000 0000 0000 0000 0000 0000
   *
   002000 8b83 f727 10d0 b918 ad2a 8edc 67f7 e178
   ...

The block from 0x1000 to 0x2000 looks an awful lot like a block of directory
entries, with the dirents:

   inode     : 373930
   rec_len   : 12
   name_len  : 1
   file_type : 2 (dir)
   name      : "."

   inode     : 347697
   rec_len   : 4084 (i.e., all the rest of the block)
   name_len  : 2
   file_type : 2 (dir)
   name      : ".."

with remnants of other, deleted dirents following it.
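For anyone who wants to check the decoding by hand, here is a minimal userspace sketch that parses the bytes from the dump above. It assumes the usual little-endian on-disk layout with a file_type byte (4-byte inode, 2-byte rec_len, 1-byte name_len, 1-byte file_type, then the name) and a 4096-byte block. The names dirent_view, parse_dirent(), count_live_dirents() and fill_sample_block() are made up for this illustration and are not kernel interfaces.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 4096

/* Hypothetical userspace view of one on-disk entry (not the kernel struct). */
struct dirent_view {
    uint32_t inode;
    uint16_t rec_len;
    uint8_t  name_len;
    uint8_t  file_type;
    char     name[256];
};

/* First 48 bytes of the suspect block, un-swapped from the od -x words. */
static const uint8_t sample_head[48] = {
    0xaa, 0xb4, 0x05, 0x00, 0x0c, 0x00, 0x01, 0x02, 0x2e, 0x00, 0x00, 0x00,
    0x31, 0x4e, 0x05, 0x00, 0xf4, 0x0f, 0x02, 0x02, 0x2e, 0x2e, 0x00, 0x00,
    0x67, 0xce, 0x04, 0x00, 0x0c, 0x00, 0x02, 0x01, 0x69, 0x6e, 0x00, 0x00,
    0x69, 0xce, 0x04, 0x00, 0xdc, 0x0f, 0x03, 0x01, 0x6f, 0x75, 0x74, 0x00,
};

/* Build a full block: the dumped bytes followed by zeros, as in the dump. */
static void fill_sample_block(uint8_t *blk)
{
    memset(blk, 0, BLOCK_SIZE);
    memcpy(blk, sample_head, sizeof(sample_head));
}

/* Decode the entry at byte offset 'off'; returns 0 if it looks sane, -1 if not. */
static int parse_dirent(const uint8_t *blk, size_t blk_len, size_t off,
                        struct dirent_view *out)
{
    const uint8_t *p;

    if (off > blk_len || blk_len - off < 8)
        return -1;
    p = blk + off;
    out->inode = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
                 (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    out->rec_len = (uint16_t)(p[4] | p[5] << 8);
    out->name_len = p[6];
    out->file_type = p[7];
    /* rec_len must be 4-byte aligned, hold the name, and stay in the block. */
    if (out->rec_len < 8 || out->rec_len % 4 != 0 ||
        out->rec_len > blk_len - off ||
        (size_t)out->name_len + 8 > out->rec_len)
        return -1;
    memcpy(out->name, p + 8, out->name_len);
    out->name[out->name_len] = '\0';
    return 0;
}

/*
 * Follow the rec_len chain from offset 0 and count entries with a nonzero
 * inode.  Returns -1 unless the chain ends exactly at the end of the block.
 */
static int count_live_dirents(const uint8_t *blk, size_t blk_len)
{
    struct dirent_view d;
    size_t off = 0;
    int live = 0;

    while (off < blk_len) {
        if (parse_dirent(blk, blk_len, off, &d) != 0)
            return -1;
        if (d.inode != 0)
            live++;
        off += d.rec_len;
    }
    return off == blk_len ? live : -1;
}
```

Following the rec_len chain from offset 0 reaches only "." and ".." (12 + 4084 = 4096). Parsing directly at offsets 24 and 36, which the chain skips, shows the leftover names "in" and "out", both with file_type 1 (regular file).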

These corruptions are pretty rare, and I can't replicate the problem in any
sort of simple test case.  But looking at the code, it seems that there's a
problem with deletion of "metadata" blocks full of dirents:  ext4_forget()
is never called for them.

For all other blocks used as metadata, ext4_forget() seems to be called when
they're about to be freed up:

   - extent blocks
   - indirect blocks
   - xattr blocks

But I don't see anywhere that we call ext4_forget() (or ext4_journal_forget()
directly) for directory blocks.

So when a directory is removed with "rm -rf foo", the directory block(s) are
marked dirty as the files in it are deleted.  But when the directory blocks
themselves are freed up, bforget() isn't called for their buffer heads, so
they remain dirty in the page cache and can be written out later, after
their blocks have been reused.
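To make the suspected sequence concrete, here is a toy userspace model of it. This is not kernel code: toy_disk, toy_buffer and the toy_*() helpers are all invented for illustration, and the model simply assumes that a freed block whose cached buffer is still dirty can be reallocated and written before that buffer is flushed.

```c
#include <stdbool.h>
#include <string.h>

#define NBLOCKS 4
#define BLKSZ   16

/* Toy stand-ins for the disk and for one cached, possibly dirty, buffer. */
struct toy_disk   { char blocks[NBLOCKS][BLKSZ]; };
struct toy_buffer { int blocknr; bool dirty; char data[BLKSZ]; };

/* Modify the cached copy and mark it dirty (e.g. removing a dirent). */
static void toy_mark_dirty(struct toy_buffer *b, const char *s)
{
    memset(b->data, 0, BLKSZ);
    strncpy(b->data, s, BLKSZ - 1);
    b->dirty = true;
}

/* What a "forget" would do in this model: drop the pending write. */
static void toy_forget(struct toy_buffer *b)
{
    b->dirty = false;
}

/* Delayed writeback: flush the cached copy if it is still dirty. */
static void toy_writeback(struct toy_disk *d, struct toy_buffer *b)
{
    if (b->dirty) {
        memcpy(d->blocks[b->blocknr], b->data, BLKSZ);
        b->dirty = false;
    }
}

/* The new owner of the block writes its file data. */
static void toy_write_data(struct toy_disk *d, int blocknr, const char *s)
{
    memset(d->blocks[blocknr], 0, BLKSZ);
    strncpy(d->blocks[blocknr], s, BLKSZ - 1);
}

/* Returns true if the file data in the reused block survives. */
static bool toy_scenario(bool forget_on_free)
{
    struct toy_disk disk;
    struct toy_buffer dirbuf = { .blocknr = 2 };

    memset(&disk, 0, sizeof(disk));
    toy_mark_dirty(&dirbuf, "dirents");     /* unlinks dirty the dir block   */
    if (forget_on_free)
        toy_forget(&dirbuf);                /* block 2 freed, buffer dropped */
    toy_write_data(&disk, 2, "file data");  /* block 2 reused by a data file */
    toy_writeback(&disk, &dirbuf);          /* stale buffer flushed later    */
    return strcmp(disk.blocks[2], "file data") == 0;
}
```

In this model toy_scenario(false) ends with the stale "dirents" contents on top of the file data, while toy_scenario(true) leaves the file data intact.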

This is the same problem I saw with extent metadata blocks "leaking" into
data blocks, fixed with c7acb4c16646943180bd221c167a077e0a084f9c, which
added calls to bforget() in ext4_journal_{forget,revoke}().  But in this
new case, it would seem to be an issue both with and without a journal, and
with both extent-based and non-extent-based directories.

Am I missing something?  And if not, suggestions on the best place to fix
this?  I was thinking of doing this in ext4_truncate() for all truncated
blocks if this is a directory; or in ext4_mb_free_blocks() if "metadata" is
1.  But this latter one would be overkill for all those "normal" metadata
blocks which already have been "forgotten."

Thanks,
Curt