Re: Raid1 where Event Count off by 1 cannot assemble --force

On Sun, 08 Dec 2013 18:38:58 -0600 "David C. Rankin"
<drankinatty@xxxxxxxxxxxxxxxxxx> wrote:

> On 12/08/2013 11:57 AM, David C. Rankin wrote:
> > On 12/08/2013 04:57 AM, Mikael Abrahamsson wrote:
> >> On Sun, 8 Dec 2013, David C. Rankin wrote:
> >>
> >>> Guys,
> >>>
> >>>  I have an older box that is a fax server where the Event Count for /dev/md1 is
> >>> off by 1, but the array cannot be reassembled with --assemble --force /dev/md1
> >>> /dev/sda5 /dev/sdb5.
> >>
> >> What are the messages displayed in "dmesg" when you try to use this command?
> >>
> > 
> > Mikael,
> > 
> >   Following the commands:
> > 
> > # mdadm --stop /dev/md1
> > # mdadm --assemble --force /dev/md1 /dev/sd[ab]5
> > 
> >   The messages captured in the logs are:
> > 
> > Rescue Kernel: md: md1: stopped.
> > Rescue Kernel: md: unbind<sda5>
> > Rescue Kernel: md: export_rdev(sda5)
> > Rescue Kernel: md: unbind<sdb5>
> > Rescue Kernel: md: export_rdev(sdb5)
> > Rescue Kernel: md: md1: stopped.
> > Rescue Kernel: md: md1 raid array is not clean -- starting background reconstruction
> > Rescue Kernel: md: raid1: raid set md1 active with 2 out of 2 mirrors
> > Rescue Kernel: md1: bitmap file is out of date (148 < 149) -- forcing full recovery
> > Rescue Kernel: md1: bitmap file is out of date, doing full recovery
> > Rescue Kernel: md1: bitmap initialisation failed: -5
> > Rescue Kernel: md1: failed to create bitmap (-5)
> > 
> > 
> >   That's it for the log, then on the command line I have:
> > 
> > mdadm: failed to RUN_ARRAY /dev/md1: Input/output error
> > 
> >   What should I try next? Don't hesitate to ask if you need any additional
> > information, I'll provide whatever is necessary. Thanks.
> > 
> 
> Here is additional information with --verbose given:
> 
> nemtemp:~ # cat /proc/mdstat
> Personalities : [raid1]
> md2 : active raid1 sda7[0] sdb7[1]
>       221929772 blocks super 1.0 [2/2] [UU]
>       bitmap: 0/424 pages [0KB], 256KB chunk
> 
> md1 : inactive sda5[0] sdb5[1]
>       41945504 blocks super 1.0
> 
> md0 : active raid1 sda1[0] sdb1[1]
>       104376 blocks super 1.0 [2/2] [UU]
>       bitmap: 0/7 pages [0KB], 8KB chunk
> 
> unused devices: <none>
> 
> nemtemp:~ # mdadm --stop /dev/md1
> mdadm: stopped /dev/md1
> 
> nemtemp:~ # cat /proc/mdstat
> Personalities : [raid1]
> md2 : active raid1 sda7[0] sdb7[1]
>       221929772 blocks super 1.0 [2/2] [UU]
>       bitmap: 0/424 pages [0KB], 256KB chunk
> 
> md0 : active raid1 sda1[0] sdb1[1]
>       104376 blocks super 1.0 [2/2] [UU]
>       bitmap: 0/7 pages [0KB], 8KB chunk
> 
> unused devices: <none>
> 
> nemtemp:~ # mdadm --verbose --assemble --force /dev/md1 /dev/sd[ab]5
> mdadm: looking for devices for /dev/md1
> mdadm: /dev/sda5 is identified as a member of /dev/md1, slot 0.
> mdadm: /dev/sdb5 is identified as a member of /dev/md1, slot 1.
> mdadm: added /dev/sdb5 to /dev/md1 as 1
> mdadm: added /dev/sda5 to /dev/md1 as 0
> mdadm: failed to RUN_ARRAY /dev/md1: Input/output error
> 
>   The log from the start attempt:
> 
> Dec  9 00:16:11 Rescue kernel: md: md1 stopped.
> Dec  9 00:16:11 Rescue kernel: md: bind<sdb5>
> Dec  9 00:16:11 Rescue kernel: md: bind<sda5>
> Dec  9 00:16:11 Rescue kernel: md: md1: raid array is not clean -- starting
> background reconstruction
> Dec  9 00:16:11 Rescue kernel: raid1: raid set md1 active with 2 out of 2 mirrors
> Dec  9 00:16:11 Rescue kernel: md1: bitmap file is out of date (148 < 149) --
> forcing full recovery
> Dec  9 00:16:11 Rescue kernel: md1: bitmap file is out of date, doing full recovery
> Dec  9 00:16:12 Rescue kernel: md1: bitmap initialisation failed: -5
> Dec  9 00:16:12 Rescue kernel: md1: failed to create bitmap (-5)
> Dec  9 00:16:12 Rescue kernel: md: pers->run() failed ...
> 
> nemtemp:~ # cat /proc/mdstat
> Personalities : [raid1]
> md2 : active raid1 sda7[0] sdb7[1]
>       221929772 blocks super 1.0 [2/2] [UU]
>       bitmap: 0/424 pages [0KB], 256KB chunk
> 
> md1 : inactive sda5[0] sdb5[1]
>       41945504 blocks super 1.0
> 
> md0 : active raid1 sda1[0] sdb1[1]
>       104376 blocks super 1.0 [2/2] [UU]
>       bitmap: 0/7 pages [0KB], 8KB chunk
> 
> unused devices: <none>
> 
>   I'm not sure how to proceed safely from here. Is there anything else I should
> try before attempting to --create the array again? If we do create the array
> with 1 drive and "missing", should I then use --add or --re-add to add the other
> drive? Also, since /dev/sda5 shows Events: 148 and /dev/sdb5 shows Events: 149,
> should I choose /dev/sdb5 as the one to preserve and let "missing" take the
> place of /dev/sda5? If so, then does the following create statement look correct:
> 
> mdadm --create --verbose --level=1 --metadata=1.0 --raid-devices=2 \
> /dev/md1 /dev/sdb5 missing
> 
>   Should I also use --force?
> 
>   If, after creating the array with "missing", the leftover device causes
> problems because it still carries a superblock with the same minor number, is it
> better to --zero-superblock the device that was replaced by "missing", or is it
> better to just unplug it and preserve its superblock data in case it is needed?
> 
>   Sorry for all the questions, but I just want to make sure I don't do something
> to compromise the data. With the information for both drives looking good with
> --examine, the (Update Time : Tue Nov 19 15:28:38 2013) being identical, and the
> Events being off by only 1, I can't see a reason the drives should not just
> assemble and run as they are. What say the experts?
> 
>   Here is the --detail and --examine information for the drives for completeness:
> 
> nemtemp:~ # mdadm --detail /dev/md1
> /dev/md1:
>         Version : 01.00.03
>   Creation Time : Thu Aug 21 06:43:22 2008
>      Raid Level : raid1
>   Used Dev Size : 20972752 (20.00 GiB 21.48 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 1
>     Persistence : Superblock is persistent
> 
>     Update Time : Tue Nov 19 15:28:38 2013
>           State : active, Not Started
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
> 
>            Name : 1
>            UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
>          Events : 148
> 
>     Number   Major   Minor   RaidDevice State
>        0       8        5        0      active sync   /dev/sda5
>        1       8       21        1      active sync   /dev/sdb5
> 
> nemtemp:/ # mdadm -E /dev/sda5
> /dev/sda5:
>           Magic : a92b4efc
>         Version : 1.0
>     Feature Map : 0x1
>      Array UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
>            Name : 1
>   Creation Time : Thu Aug 21 06:43:22 2008
>      Raid Level : raid1
>    Raid Devices : 2
> 
>  Avail Dev Size : 41945504 (20.00 GiB 21.48 GB)
>      Array Size : 41945504 (20.00 GiB 21.48 GB)
>    Super Offset : 41945632 sectors
>           State : clean
>     Device UUID : e0c1c580:db4d853e:6fac1c8f:fb5399d7
> 
> Internal Bitmap : -81 sectors from superblock
>     Update Time : Tue Nov 19 15:28:38 2013
>        Checksum : d37d1086 - correct
>          Events : 148
> 
> 
>     Array Slot : 0 (0, 1)
>    Array State : Uu
> 
> nemtemp:/ # mdadm -E /dev/sdb5
> /dev/sdb5:
>           Magic : a92b4efc
>         Version : 1.0
>     Feature Map : 0x1
>      Array UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
>            Name : 1
>   Creation Time : Thu Aug 21 06:43:22 2008
>      Raid Level : raid1
>    Raid Devices : 2
> 
>  Avail Dev Size : 41945504 (20.00 GiB 21.48 GB)
>      Array Size : 41945504 (20.00 GiB 21.48 GB)
>    Super Offset : 41945632 sectors
>           State : active
>     Device UUID : 6edfa3f8:c8c4316d:66c19315:5eda0911
> 
> Internal Bitmap : -81 sectors from superblock
>     Update Time : Tue Nov 19 15:28:38 2013
>        Checksum : 39ef40a5 - correct
>          Events : 149
> 
> 
>     Array Slot : 1 (0, 1)
>    Array State : uU
> 
> 
> 

What version of mdadm do you have?  It looks like it should be cleverer than
it is.

What if you add "--update=no-bitmap" to the --assemble line?
As the bitmap seems to be causing the problem, ignoring it might help.
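A sketch of that sequence, using the device names from the thread (assuming an mdadm recent enough to support --update=no-bitmap; run only against the actual array after double-checking device names):

```shell
# Stop the inactive array first so it can be reassembled cleanly
mdadm --stop /dev/md1

# Force-assemble, discarding the stale internal bitmap that is
# failing to initialise (the "bitmap file is out of date (148 < 149)"
# errors above); --force reconciles the event-count difference
mdadm --verbose --assemble --force --update=no-bitmap /dev/md1 /dev/sd[ab]5

# Confirm md1 came up active before mounting anything
cat /proc/mdstat
```

If the array then starts and the data checks out, the internal bitmap could presumably be recreated afterwards with something like `mdadm --grow --bitmap=internal /dev/md1`.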

NeilBrown


