Re: Suggested use of --invalid-backup?

On Mon, 8 Apr 2013 14:13:31 -0500 Barrett Lewis
<barrett.lewis.mitsi@xxxxxxxxx> wrote:

> As much as I hate to bump, are there no thoughts on this?
> 
> The most important question is if I have a possibly corrupted version
> of a backup file, should I supply it with the --invalid-backup flag?
> Or does that expect a blank file only?
> 
> On Tue, Apr 2, 2013 at 3:20 PM, Barrett Lewis
> <barrett.lewis.mitsi@xxxxxxxxx> wrote:
> > I was reshaping a 5x2tb raid5 to a 6x2tb raid6.  Not knowing that
> > ubuntu deletes the /tmp/ folder each reboot, I specified my
> > --backup-file as /tmp/raid-backup.bak (this is not part of the array).
> >  At 15.1% the system hung sufficiently that REISUB and the reset
> > button were ignored and I had to hold the power button down to reset
> > the server.  After booting back from the crash, the array would not
> > start, and ubuntu had deleted the backup file (and everything else in
> > /tmp).
> >
> > The superblock already says it's raid6, all members are present and
> > the event counters are the same on all disks.  I tried
> >
> > ubuntu@ubuntu:~$ sudo mdadm --assemble --force --run --verbose
> > /dev/md0 /dev/sd[abcdef]
> > mdadm: looking for devices for /dev/md0
> > mdadm: /dev/sda is identified as a member of /dev/md0, slot 4.
> > mdadm: /dev/sdb is identified as a member of /dev/md0, slot 0.
> > mdadm: /dev/sdc is identified as a member of /dev/md0, slot 5.
> > mdadm: /dev/sdd is identified as a member of /dev/md0, slot 2.
> > mdadm: /dev/sde is identified as a member of /dev/md0, slot 3.
> > mdadm: /dev/sdf is identified as a member of /dev/md0, slot 1.
> > mdadm:/dev/md0 has an active reshape - checking if critical section
> > needs to be restored
> > mdadm: Failed to find backup of critical section
> > mdadm: Failed to restore critical section for reshape, sorry.
> >       Possibly you needed to specify the --backup-file
> >
> >
> > My understanding is that the backup file is only for some early
> > critical part of the reshape and that it isn’t even used after that.
> > 15% into 8tb is well over a terabyte so wouldn’t that be far past any
> > filesystem metadata?  So what exactly is implied (about the state of
> > the reshape) by the fact that programmatically it is still requiring
> > the backup file?
> >
> > I have read the manpage on the --invalid-backup command but I didn't
> > clearly get "use it here, not here" type of information.  I have the
> > OS drive (with deleted /tmp/raid-backup.bak) in a data recovery
> > process.  If I actually get the backup file recovered, it could
> > potentially have corrupted bits.  Is the best course of action to:
> > Supply the (potentially corrupted, but maybe some percent ok)
> > recovered backup file as the legitimate backup file (without
> > --invalid-backup)? (could this be worse than --invalid-backup and a
> > blank file?)
> > Supply the (potentially corrupted) recovered backup file WITH --invalid-backup?
> > Supply --invalid-backup and an empty file?
> >
> > Or if I am on the wrong path, let me know of any other thoughts or
> > suggestions you might have.
> >
> > If I get nothing useful back from data recovery, and I have to supply
> > --invalid-backup with a blank file, considering the reshape made it to
> > 15%, how much chance is there that the array could assemble and resume
> > reshape?  I would gladly accept the corruption of some files vs losing
> > the whole file system (obviously).
> >

There is no risk in providing a backup file - if it doesn't look good it
will be ignored.

When md does an in-place reshape like this it:
  - reads several stripes
  - writes them to the backup file
  - writes them back to the devices
  - updates the metadata
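
The cycle above can be sketched as a toy model. This is only an
illustration, not md's code: plain files stand in for the array, the
backup file and the superblock, 'tr' stands in for the relayout, and
the name reshape_step is made up for this sketch.

```shell
#!/bin/sh
# Toy model of one reshape pass over a single chunk.
# dev, backup and meta are ordinary files (assumptions for illustration).
reshape_step() {
    rs_dev=$1; rs_backup=$2; rs_meta=$3; rs_chunk=$4
    rs_pos=$(cat "$rs_meta")

    # Steps 1 and 2: read a chunk and save it to the backup file.
    dd if="$rs_dev" of="$rs_backup" bs="$rs_chunk" skip="$rs_pos" count=1 2>/dev/null

    # Step 3: write the chunk back in place in its new form.
    # A crash in the middle of this write is the case that needs the backup.
    tr 'a-z' 'A-Z' < "$rs_backup" |
        dd of="$rs_dev" bs="$rs_chunk" seek="$rs_pos" conv=notrunc 2>/dev/null

    # Step 4: record progress so a restart resumes after this chunk.
    echo $((rs_pos + 1)) > "$rs_meta"
}

# Demo on scratch files: two chunks of four bytes each.
demo_dir=$(mktemp -d)
printf 'abcdefgh' > "$demo_dir/dev"
echo 0 > "$demo_dir/meta"
reshape_step "$demo_dir/dev" "$demo_dir/backup" "$demo_dir/meta" 4
reshape_step "$demo_dir/dev" "$demo_dir/backup" "$demo_dir/meta" 4
cat "$demo_dir/dev"; echo     # prints ABCDEFGH
rm -rf "$demo_dir"
```

In this model, once step 4 has run the backup for that chunk is no
longer needed, which is why only an interruption during step 3 leaves
data that depends on the backup file.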

If your crash was during the "writes them back" section, then you will have
some corruption that you cannot avoid without having exactly the right backup
file.

With luck the corruption should be fairly limited.

There is nothing better that you can do than reassemble the array with the
best backup file you can find, and with --invalid-backup.  Then 'fsck' and do
whatever you can to validate your data.
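
As a sketch of what that assemble command might look like, reusing the
device names and options from the original attempt: the backup path
below is a placeholder, and the helper name assemble_cmd is made up
here. It only prints the command so it can be reviewed before being
run as root.

```shell
#!/bin/sh
# Print (do not run) an assemble command that supplies a backup file
# and tells mdadm to carry on even if that file turns out to be unusable.
assemble_cmd() {
    ac_backup=$1; shift
    echo mdadm --assemble --force --verbose /dev/md0 \
        --backup-file="$ac_backup" --invalid-backup "$@"
}

# /root/raid-backup.bak is a hypothetical location for the recovered
# (or empty) backup file; it must not live on the array itself.
assemble_cmd /root/raid-backup.bak \
    /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
```

If the printed line looks right, run it with sudo, then check
/proc/mdstat before running fsck.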

NeilBrown
