Re: reshape changing chunk size won't restart

On Tue, 21 Dec 2010 16:09:59 -0800 Andrew Burgess <aab@xxxxxxxxxxx> wrote:

> On 12/21/2010 02:16:19 PM, Neil Brown wrote:
> 
> > > I started a reshape changing chunk size and after it ran
> > > for a while i realized the disk i used for the
> > > backup file was slow so I killed the mdadm
> > 
> > That was a mistake.
> 
> It's looking to be a bad one
> 
> > > running in the background and tried to restart
> > > with the new location (i moved the file just in case)
> > >
> > > mdadm /dev/md5 --grow --chunk=8 --backup-file=/my/raid/RAID_BACKUP_FILE
> > 
> > As you discovered, that doesn't work.  I'd like to make it possible to do
> > something like that, but time is not something I have a lot of.
> 
> Understand 100%
> 
> > > I didn't try rebooting as the filesystem is mounted and
> > > the data seems ok. Didn't want to make things worse...
> > 
> > It shouldn't make things worse.
> 
> I had to, because umount wouldn't work and neither fuser nor lsof
> could find the guilty party
> 
> > You don't need to reboot, unless md5 has your root filesystem.
> > Just unmount, 'mdadm -S /dev/md5', and assemble:
> >   mdadm -A /dev/md5 --backup-file=/whereever-you-copied-the-file-to \
> >       /dev/sd[dfcbhljgk]1
> > 
> > should do it.
> 
> After rebooting something happened to sdg1:
> 
> mdadm -A /dev/md5 --backup-file=/my/raid/RAID_BACKUP_FILE /dev/sd[dfcbhljgk]1
> mdadm: cannot open device /dev/sdg1: No such device or address
> mdadm: /dev/sdg1 has no superblock - assembly aborted
> 
> so i tried it with sdg1 missing
> 
> mdadm -A /dev/md5 --backup-file=/my/raid/RAID_BACKUP_FILE /dev/sd[dfcbhljk]1
> mdadm: Failed to restore critical section for reshape, sorry.
> 
> so i rebooted and power cycled hoping to get sdg1 back but it was
> still unhappy with the superblock
> 
> I even tried it letting it scan for devices:
> 
> mdadm -A /dev/md5 --backup-file=/my/raid/RAID_BACKUP_FILE
> mdadm: WARNING /dev/sdg1 and /dev/sdg appear to have very similar superblocks.
>        If they are really different, please --zero the superblock on one
>        If they are the same or overlap, please remove one from the
>        DEVICE list in mdadm.conf.
> 
> so repeating with all but sdg1 specified it results in:
> 
> mdadm: Failed to restore critical section for reshape, sorry.
> 
> Anything else I can try? We do have the sector it was on in the original
> email when it stopped: (2715648/1953511936)


The business with sdg1 is a bit odd... I would use "--examine" to check each
device and make sure they have good matching superblocks.  It would be a lot
better if you can make sure all devices get included when you start the array.
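
A rough sketch of that check, assuming a POSIX shell. The function name
summarize_members and the MDADM override variable are made up for
illustration, and the field names it greps for are the ones --examine
usually prints; they may differ between metadata versions:

```shell
# Hypothetical helper: print the superblock fields that should agree
# across all members of an array, one block per device.
# MDADM is an illustrative override so the loop can be tried against a stub;
# it is not something mdadm itself knows about.
summarize_members() {
    for dev in "$@"; do
        echo "== $dev"
        # Keep only the fields worth comparing, plus any mdadm error lines.
        "${MDADM:-mdadm}" --examine "$dev" 2>&1 |
            grep -E 'UUID|Events|Chunk|Reshape|Raid Devices|^mdadm:' ||
            echo "   (no matching superblock fields)"
    done
}

# Example (device list taken from the commands earlier in this thread):
#   summarize_members /dev/sd[dfcbhljgk]1
```

Any device whose UUID, event count or reshape position differs from the
others, or that reports no superblock at all, is the one to look at first.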

Also, try starting with '--verbose', it might give some useful information,
but I don't hold out a lot of hope.

Finally, you will probably end up having to modify mdadm so that it ignores a
failure from Grow_restart.  As you had a reasonably clean shutdown rather
than a crash, there is a good chance that the backup file isn't actually
needed.
The next release of mdadm will have a --invalid-backup option to --assemble
to tell it to just continue even though the backup file looks wrong.

And the (feature) release after that, when used with a newer kernel (maybe
2.6.38) and v1.x metadata, should be able to do these reshapes without a
backup file, which would be a major win!

NeilBrown