Problems with ext3 fs

sct@redhat.com (Stephen C. Tweedie) · Fri, 1 Mar 2002 10:28:01 +0000

Hi,

On Fri, Mar 01, 2002 at 09:10:03AM +0000, John Matthews wrote:

> More info for you all. I just checked dmesg (the fs tripped into ro mode
> again last night) to get my system output and found this:
> 
> EXT3-fs error (device md(9,8)): ext3_readdir: directory #80865 contains a
> hole at offset 4096

On reboot, could you please run debugfs on this filesystem and send me
the output of

$ debugfs /dev/md8
debugfs:  stat <80865>

so I can see what this file looks like?  If possible, it would also be
useful to capture the journal file on reboot so that I can see what
has been happening on that file recently.  tune2fs -l will tell you
what journal inode is being used (it's normally 8 if you're using an
internal journal), and

echo 'dump <8> /tmp/md8.journal.dump' | debugfs -f - /dev/md8

should capture it.  You'll need to run that before the fsck in your
startup scripts, of course, as fsck will replay and reset the journal.
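
If the journal inode turns out not to be 8, the lookup and the dump can
be combined.  The following is only a rough sketch: the function names
journal_inode and capture_journal are made up for illustration, and it
assumes tune2fs -l prints a "Journal inode:" line, which it does for a
filesystem with an internal journal.  It needs to run as root.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of e2fsprogs.
# Run as root, and before fsck replays and resets the journal.

# Read `tune2fs -l` output on stdin and print the journal inode number.
# Prints nothing if there is no "Journal inode:" line.
journal_inode() {
    awk -F: '/^Journal inode:/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Dump the internal journal of device $1 into file $2.
capture_journal() {
    dev=$1
    out=$2
    ino=$(tune2fs -l "$dev" | journal_inode)
    if [ -z "$ino" ]; then
        echo "no internal journal inode reported for $dev" >&2
        return 1
    fi
    # Same debugfs request as above, with the inode filled in.
    echo "dump <$ino> $out" | debugfs -f - "$dev"
}
```

With those defined, something like
"capture_journal /dev/md8 /tmp/md8.journal.dump" early in the boot
sequence should leave the journal image in /tmp for later inspection.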

The question is whether we can eliminate software or hardware from the
picture.  If we're seeing bad data like that in the directory, but on
reboot the directory looks intact and it was OK in the journal, then
we're almost certainly looking at data corruption happening somewhere
between the disk and memory, and that makes a hardware problem the
more likely explanation.

Cheers,
 Stephen