Re: fs corruption recovery

On Thu, Mar 19, 2015 at 10:47:17PM -0700, Allison Henderson wrote:
> On 03/19/2015 06:47 PM, Theodore Ts'o wrote:
> >On Wed, Mar 18, 2015 at 06:59:52PM -0600, Andreas Dilger wrote:
> >>I think that running a 17TB filesystem on ext3 is a recipe for disaster.  They should use ext4 for anything larger than 16TB.
> >
> >It's not *possible* to have a 17TB file system with ext3.  Something
> >must be very wrong there.  16TB is the maximum you can have before you
> >end up overflowing a 32-bit block number.  Unless this is a PowerPC
> >with a 16K block size or some such?
> >
> >If e2fsck is segfaulting, then I would certainly try getting the
> >latest version of e2fsprogs, just in case the problem isn't just that
> >it's running out of memory.  Also if recovering customer data is the
> >most important thing, the first thing they should do is make an image
> >copy of the file system, since it's possible that incorrect use of
> >e2fsck, or an old/buggy version of e2fsck could make things work.

...make things *worse*.
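
To make the image-copy advice concrete, here is a minimal Python sketch that
copies a device (or file) to an image file in fixed-size chunks and returns a
SHA-256 of what was read, so the copy can be checked later.  The function name
copy_image, the chunk size and the paths are illustrative assumptions, not
anything from e2fsprogs; in practice dd or e2image would be the usual tools,
and the file system should be unmounted while it is copied.

```python
import hashlib


def copy_image(src_path, dst_path, chunk_size=4 * 1024 * 1024):
    """Copy src_path to dst_path in chunks; return (bytes_copied, sha256_hex).

    Hypothetical helper for illustration only. It does not retry or skip
    unreadable sectors; a failing disk needs a tool built for that.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # end of device/file
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

Any e2fsck experiments can then be run against a copy of the image rather than
the only surviving instance of the data.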

> >
> >In particular, if they are seeing errors with multiply claimed inodes,
> >it's likely that part of the inode table was written to the wrong
> >place, and sometimes a skilled human being can get more data than
> >simply using e2fsck -y and praying.  At the end of the day the
> >question is how much is the customer data worth and how much effort is
> >the customer / IBM willing to invest in trying to get every last bit
> >of data back?
> >
> >						- Ted
> >
> 
> Hi all,
> 
> Sorry for the delay, our email servers went down for a bit after I
> sent the email.  I will work with Marcel to find the block size,
> page size and arch.  It is my understanding that they have a

Just guessing PPC, in which case you'll really want an e2fsck released after
the giant heaps of bugfixes I've sent over the last year.  There were a lot
of bugs that only show up on big-endian systems, which probably don't get
much testing nowadays.

Even if it's a 17179869184 byte ext3 FS on x86, you're probably still better
off with a less buggy e2fsck.  There are a number of fixes to prevent the
crosslinked file fixer and the directory fixer from doing insane things to
the FS.
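
For reference, the 16TB ceiling Ted mentions falls straight out of the
arithmetic: a 32-bit block number can address 2^32 blocks, so the limit is
2^32 times the block size.  A short sketch, where the helper name
max_fs_bytes is just an illustration:

```python
TIB = 2 ** 40  # bytes in one tebibyte


def max_fs_bytes(block_size, block_number_bits=32):
    """Largest file system addressable with the given block-number width."""
    return (2 ** block_number_bits) * block_size


# Print the ceiling for a few common block sizes.
for block_size in (1024, 4096, 16384):
    limit = max_fs_bytes(block_size)
    print(f"{block_size}-byte blocks: {limit // TIB} TiB")
```

That gives 4 TiB for 1K blocks, 16 TiB for 4K blocks and 64 TiB for 16K
blocks, which is why a 17TB ext3 file system only makes sense with a block
size larger than 4K.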

> contract with this customer to maintain this data, so there is
> pressure to recover it. Unfortunately the product mirrored the fs
> corruption to the backup device before the corruption was
> discovered.  I've been told that I was the only person they could
> find left that had some background with ext3/4, so I have an inkling

Yep. ;)

> that the "skilled human being" might end up being me, even though
> it's been a while since I've worked with it. :-) Maybe I could poke
> into the inode table and see what I can figure out. We will be sure
> to make image backups though.  Thx a bunch for the feedback, we
> really appreciate the help!  I will keep folks updated when I have
> more info.  Thx!

If you have LVM or other volume management, please take a snapshot and fsck the
snapshot first, so you can capture a log of what happens without blasting away
at existing data.
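
A rough sketch of that workflow, written as a Python helper that only builds
the command lines so they can be reviewed before anything is run.  The volume
group and LV names, the snapshot name, the snapshot size and the log path are
all placeholders; the snapshot has to be large enough to absorb everything
e2fsck writes.

```python
import shlex


def snapshot_fsck_commands(vg, lv, snap_name="fsck_snap",
                           snap_size="20G", log_path="e2fsck-snapshot.log"):
    """Return the commands to snapshot an LV and fsck the snapshot.

    Hypothetical helper: nothing is executed here, the caller decides
    whether to run the commands (as root) after reading them.
    """
    origin = f"/dev/{vg}/{lv}"
    snap = f"/dev/{vg}/{snap_name}"
    return [
        # Create a copy-on-write snapshot of the damaged volume.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, origin],
        # Force a full check of the snapshot and keep a log of the output.
        ["sh", "-c",
         f"e2fsck -f -y {shlex.quote(snap)} 2>&1 | tee {shlex.quote(log_path)}"],
    ]


for cmd in snapshot_fsck_commands("vg0", "data"):
    print(" ".join(shlex.quote(part) for part in cmd))
```

The original LV is never touched by e2fsck this way, and the log shows what
the real run would try to do.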

--D

> 
> Allison Henderson
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html