Re: Problems with raid after reboot.

On Fri, Jul 22, 2011 at 9:12 PM, Matthew Tice <mjtice@xxxxxxxxx> wrote:
> On Wed, Jul 20, 2011 at 3:24 PM, Matthew Tice <mjtice@xxxxxxxxx> wrote:
>> Hello,
>>
>> I had to shut down my machine for a move - when I powered it back up, my
>> RAID-5 array was in a bad state:
>>
>> # mdadm -A -s
>> mdadm: no devices found for /dev/md0
>> mdadm: /dev/md/0 assembled from 2 drives - not enough to start the array.
>>
>> I ended up forcing the assembly:
>>
>> # mdadm -A -s -f
>> mdadm: no devices found for /dev/md0
>> mdadm: forcing event count in /dev/sde(1) from 177 upto 181
>> mdadm: clearing FAULTY flag for device 1 in /dev/md/0 for /dev/sde
>> mdadm: /dev/md/0 has been started with 3 drives (out of 4).
>>
>> Looking at the detailed output, the missing disk (/dev/sdc) is "removed":
>>
>>
>> # mdadm --detail /dev/md0
>> /dev/md0:
>>        Version : 00.90
>>  Creation Time : Sat Mar 12 21:22:34 2011
>>     Raid Level : raid5
>>     Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
>>  Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
>>   Raid Devices : 4
>>  Total Devices : 3
>> Preferred Minor : 0
>>    Persistence : Superblock is persistent
>>
>>    Update Time : Tue Jul 19 20:44:45 2011
>>          State : clean, degraded
>>  Active Devices : 3
>> Working Devices : 3
>>  Failed Devices : 0
>>  Spare Devices : 0
>>
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>>
>>           UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
>>         Events : 0.181
>>
>>    Number   Major   Minor   RaidDevice State
>>       0       8       80        0      active sync   /dev/sdf
>>       1       8       64        1      active sync   /dev/sde
>>       2       8       48        2      active sync   /dev/sdd
>>       3       0        0        3      removed
>>
>>
>> I can examine the disk but I'm unable to add it (I don't recall if it
>> needs to be removed first or not):
>>
>> # mdadm --examine /dev/sdc
>> /dev/sdc:
>>          Magic : a92b4efc
>>        Version : 00.90.00
>>           UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
>>  Creation Time : Sat Mar 12 21:22:34 2011
>>     Raid Level : raid5
>>  Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
>>     Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
>>   Raid Devices : 4
>>  Total Devices : 3
>> Preferred Minor : 0
>>
>>    Update Time : Tue Jul 19 20:44:45 2011
>>          State : clean
>>  Active Devices : 3
>> Working Devices : 3
>>  Failed Devices : 1
>>  Spare Devices : 0
>>       Checksum : 22ac1229 - correct
>>         Events : 181
>>
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>>
>>      Number   Major   Minor   RaidDevice State
>> this     3       8       32        3      active sync   /dev/sdc
>>
>>   0     0       8       80        0      active sync   /dev/sdf
>>   1     1       0        0        1      faulty removed
>>   2     2       8       48        2      active sync   /dev/sdd
>>   3     3       8       32        3      active sync   /dev/sdc
>>
>> # mdadm --add /dev/md0 /dev/sdc
>> mdadm: Cannot open /dev/sdc: Device or resource busy
>>
>> So, a couple of questions.
>>
>> 1. Any thoughts on what would cause this?  I seem to have bad luck
>> with my RAID arrays whenever I reboot.
>> 2. How do I fix it?  Everything *seems* to be as it should be . . .
>>
>> Here is the mdadm.conf:
>>
>> # cat /etc/mdadm/mdadm.conf  | grep -v ^#
>>
>> DEVICE partitions
>> CREATE owner=root group=disk mode=0660 auto=yes
>> HOMEHOST <system>
>> MAILADDR root
>> ARRAY /dev/md0 level=raid5 num-devices=4 UUID=11c1cdd8:60ec9a90:2e29483d:f114274d
>>
>> Any help is greatly appreciated.
>>
>> Matt
>>
>
> One thing I just noticed that seems strange: when I run --examine on
> /dev/sdc, it shows the drive itself as active sync in the device list,
> while a different slot is listed as faulty.  Am I reading that right?
>
>
> # mdadm --examine /dev/sdc
> /dev/sdc:
>          Magic : a92b4efc
>        Version : 00.90.00
>           UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
>  Creation Time : Sat Mar 12 21:22:34 2011
>     Raid Level : raid5
>  Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
>     Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
>   Raid Devices : 4
>  Total Devices : 3
> Preferred Minor : 0
>
>    Update Time : Tue Jul 19 20:44:45 2011
>          State : clean
>  Active Devices : 3
> Working Devices : 3
>  Failed Devices : 1
>  Spare Devices : 0
>       Checksum : 22ac1229 - correct
>         Events : 181
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
>      Number   Major   Minor   RaidDevice State
> this     3       8       32        3      active sync   /dev/sdc
>
>   0     0       8       80        0      active sync   /dev/sdf
>   1     1       0        0        1      faulty removed
>   2     2       8       48        2      active sync   /dev/sdd
>   3     3       8       32        3      active sync   /dev/sdc
>

Well, things are a lot different now - I'm unable to start the array at
all.  I removed an older, unrelated drive that was giving me SMART
errors - when I rebooted, the drive assignments shifted (not sure that
really matters, though).

Now when I try to start the array, I get:

# mdadm -A -f /dev/md0
mdadm: no devices found for /dev/md0

I can nudge it slightly with auto-detect:

# mdadm --auto-detect

Then I try to assemble the array with:

# mdadm -A -f /dev/md0 /dev/sd[bcde]
mdadm: cannot open device /dev/sde: Device or resource busy
mdadm: /dev/sde has no superblock - assembly aborted

I ran Phil's lsdrv script (and an mdadm --examine /dev/sd[bcde]), so
hopefully that helps:

# lsdrv

PCI [ata_piix] 00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
 ├─scsi 0:0:0:0 LITE-ON COMBO SOHC-4836K {2006061700044437}
 │  └─sr0: [11:0] Empty/Unknown 1.00g
 └─scsi 1:x:x:x [Empty]
PCI [ata_piix] 00:1f.2 IDE interface: Intel Corporation N10/ICH7 Family SATA IDE Controller (rev 01)
 ├─scsi 2:x:x:x [Empty]
 └─scsi 3:0:0:0 ATA HDS728080PLA380 {PFDB20S4SNLT6J}
    └─sda: [8:0] Partitioned (dos) 76.69g
       ├─sda1: [8:1] (ext4) 75.23g {960433b3-af56-41bd-bb9a-d0a0fb5ffb45}
       │  └─Mounted as /dev/disk/by-uuid/960433b3-af56-41bd-bb9a-d0a0fb5ffb45 @ /
       ├─sda2: [8:2] Partitioned (dos) 1.00k
       └─sda5: [8:5] (swap) 1.46g {10c3b226-16d4-44ea-ad1e-6296bb92969d}
PCI [sata_sil24] 04:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
 ├─scsi 4:0:0:0 ATA WDC WD7500AADS-0 {WD-WCAV59574584}
 │  └─sdb: [8:16] MD raid5 (4) 698.64g inactive {daf06d5a-b805-28b1-2e29-483df114274d}
 ├─scsi 4:1:0:0 ATA WDC WD7500AADS-0 {WD-WCAV59459025}
 │  └─sdc: [8:32] MD raid5 (4) 698.64g inactive {daf06d5a-b805-28b1-2e29-483df114274d}
 ├─scsi 4:2:0:0 ATA Hitachi HDS72101 {JP9911HZ1SKHNU}
 │  └─sdd: [8:48] MD raid5 (4) 931.51g inactive {daf06d5a-b805-28b1-2e29-483df114274d}
 ├─scsi 4:3:0:0 ATA Hitachi HDS72101 {JP9960HZ1VK96U}
 │  └─sde: [8:64] MD raid5 (none/4) 931.51g md_d0 inactive spare {daf06d5a-b805-28b1-2e29-483df114274d}
 │     └─md_d0: [254:0] Empty/Unknown 0.00k
 └─scsi 7:x:x:x [Empty]
PCI [pata_via] 02:00.0 IDE interface: VIA Technologies, Inc. PATA IDE Host Controller
 ├─scsi 5:x:x:x [Empty]
 └─scsi 6:x:x:x [Empty]
PCI [sata_sil24] 05:01.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
 ├─scsi 8:x:x:x [Empty]
 ├─scsi 9:x:x:x [Empty]
 ├─scsi 10:x:x:x [Empty]
 └─scsi 11:x:x:x [Empty]
Other Block Devices
 ├─ram0: [1:0] Empty/Unknown 64.00m
 ├─ram1: [1:1] Empty/Unknown 64.00m
 ├─ram2: [1:2] Empty/Unknown 64.00m
 ├─ram3: [1:3] Empty/Unknown 64.00m
 ├─ram4: [1:4] Empty/Unknown 64.00m
 ├─ram5: [1:5] Empty/Unknown 64.00m
 ├─ram6: [1:6] Empty/Unknown 64.00m
 ├─ram7: [1:7] Empty/Unknown 64.00m
 ├─ram8: [1:8] Empty/Unknown 64.00m
 ├─ram9: [1:9] Empty/Unknown 64.00m
 ├─ram10: [1:10] Empty/Unknown 64.00m
 ├─ram11: [1:11] Empty/Unknown 64.00m
 ├─ram12: [1:12] Empty/Unknown 64.00m
 ├─ram13: [1:13] Empty/Unknown 64.00m
 ├─ram14: [1:14] Empty/Unknown 64.00m
 └─ram15: [1:15] Empty/Unknown 64.00m


# mdadm --examine /dev/sd[bcde]

/dev/sdb:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
  Creation Time : Sat Mar 12 21:22:34 2011
     Raid Level : raid5
  Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
     Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 0

    Update Time : Mon Jul 25 14:08:30 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 22b3c880 - correct
         Events : 5593

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       64        3      active sync   /dev/sde

   0     0       0        0        0      removed
   1     1       8       32        1      active sync   /dev/sdc
   2     2       8       16        2      active sync   /dev/sdb
   3     3       8       64        3      active sync   /dev/sde
/dev/sdc:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
  Creation Time : Sat Mar 12 21:22:34 2011
     Raid Level : raid5
  Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
     Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 0

    Update Time : Mon Jul 25 14:08:30 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 22b3c84e - correct
         Events : 5593

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       16        2      active sync   /dev/sdb

   0     0       0        0        0      removed
   1     1       8       32        1      active sync   /dev/sdc
   2     2       8       16        2      active sync   /dev/sdb
   3     3       8       64        3      active sync   /dev/sde
/dev/sdd:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
  Creation Time : Sat Mar 12 21:22:34 2011
     Raid Level : raid5
  Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
     Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 0

    Update Time : Mon Jul 25 14:08:30 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 22b3c85c - correct
         Events : 5593

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       32        1      active sync   /dev/sdc

   0     0       0        0        0      removed
   1     1       8       32        1      active sync   /dev/sdc
   2     2       8       16        2      active sync   /dev/sdb
   3     3       8       64        3      active sync   /dev/sde
/dev/sde:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
  Creation Time : Sat Mar 12 21:22:34 2011
     Raid Level : raid5
  Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
     Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 0

    Update Time : Sun Jul 24 11:46:10 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 22b255fc - correct
         Events : 5591

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       80        0      active sync

   0     0       8       80        0      active sync
   1     1       8       64        1      active sync   /dev/sde
   2     2       8       48        2      active sync   /dev/sdd
   3     3       0        0        3      faulty removed

I've looked, but I'm unable to find what is holding the drive in use.
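
Although, re-reading the lsdrv output above, /dev/sde is listed under an
inactive md_d0, so my best guess is that a stale, partially-assembled
md_d0 is what's keeping the disk busy.  If that's the case, I assume
something like this would release it before retrying the assembly
(untested guess on my part; md_d0 is just the device name lsdrv
reported):

# cat /proc/mdstat                    # check for a stale md_d0 claiming /dev/sde
# mdadm --stop /dev/md_d0             # release its member disk(s)
# mdadm -A -f /dev/md0 /dev/sd[bcde]  # then retry the forced assembly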