RAID5 starts degraded after reboot

Hi

I have created a 3-drive RAID5 array, but after each reboot it starts
degraded with only 1 drive. It also changes the UUID of that drive. I have
to manually stop the array and force-create it. I've seen similar problems in
the archives but still have not found a solution.

Here are my details:

My machine runs Debian 4.0 with 2.6.18-5-686 #1 SMP kernel. I have 4 drive
controllers. One onboard IDE, one SCSI and two identical two port SATA
controllers. The root is on /dev/sda1 SCSI drive. Drives /dev/sdb, /dev/sdc
and /dev/sdd are used for the RAID5 array.

# fdisk -l

Disk /dev/sda: 9139 MB, 9139200000 bytes
255 heads, 63 sectors/track, 1111 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         985     7911981   83  Linux
/dev/sda2             986        1111     1012095   82  Linux swap / Solaris

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       60801   488384001   fd  Linux raid autodetect

Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1       60801   488384032   fd  Linux raid autodetect

Disk /dev/sdd: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       60801   488384001   fd  Linux raid autodetect

Disk /dev/hdc: 300.0 GB, 300069052416 bytes
255 heads, 63 sectors/track, 36481 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hdc1               1       36481   293033601   83  Linux

Disk /dev/md0: 1000.2 GB, 1000210300928 bytes
2 heads, 4 sectors/track, 244191968 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md0 doesn't contain a valid partition table



# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Sep 18 08:58:19 2007
     Raid Level : raid5
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
    Device Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Sep 18 09:35:29 2007
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 72591f57:6d449586:dc3c5ed4:95c964aa
         Events : 0.2

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
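
As a sanity check on the sizes reported above: for RAID5 the usable capacity
should be (number of devices - 1) times the per-device size, since one
device's worth of space holds parity. A minimal sketch of that arithmetic
(the function name raid5_size_kib is only for illustration; sizes are in
1 KiB blocks, as mdadm prints them):

```shell
# Usable RAID5 capacity in KiB: (n - 1) * per-device size.
# raid5_size_kib is a hypothetical helper name, not part of mdadm.
raid5_size_kib() {
    devices=$1
    device_kib=$2
    echo $(( (devices - 1) * device_kib ))
}

raid5_size_kib 3 488383936    # prints 976767872
```

That agrees with the Array Size line, and 976767872 * 1024 = 1000210300928
bytes agrees with what fdisk shows for /dev/md0.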


The file /etc/mdadm/mdadm.conf has these two lines:

DEVICES /dev/sdb1 /dev/sdc1 /dev/sdd1
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=72591f57:6d449586:dc3c5ed4:95c964aa devices=/dev/sdb1,/dev/sdc1,/dev/sdd1
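
For comparison, mdadm.conf also allows an array to be identified by UUID
alone, with "DEVICE partitions" telling mdadm to consider everything listed
in /proc/partitions, so the entry does not depend on sdX names. The sketch
below only prints such a fragment; the helper name is made up, and whether
this form changes the behaviour described here is untested.

```shell
# Print an mdadm.conf fragment that identifies the array by UUID only.
# print_mdadm_conf is a hypothetical helper, not an mdadm command.
print_mdadm_conf() {
    md_dev=$1
    uuid=$2
    echo "DEVICE partitions"
    echo "ARRAY $md_dev UUID=$uuid"
}

print_mdadm_conf /dev/md0 72591f57:6d449586:dc3c5ed4:95c964aa
```

On a running array, "mdadm --detail --scan" prints ARRAY lines in a similar
form that can be compared against the file.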

After reboot I see this:

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Sep 18 08:58:19 2007
     Raid Level : raid5
    Device Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 3
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Sep 18 16:30:01 2007
          State : active, degraded, Not Started
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 72591f57:6d449586:b90eefda:5c4888bb
         Events : 0.4

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       0        0        1      removed
       2       8       49        2      active sync   /dev/sdd1

# mdadm --examine /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 72591f57:6d449586:dc3c5ed4:95c964aa
  Creation Time : Tue Sep 18 08:58:19 2007
     Raid Level : raid5
    Device Size : 488383936 (465.76 GiB 500.11 GB)
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0

    Update Time : Tue Sep 18 16:30:01 2007
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : a5cb4636 - correct
         Events : 0.4

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       17        0      active sync   /dev/sdb1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync   /dev/sdd1

# mdadm --examine /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 72591f57:6d449586:dc3c5ed4:95c964aa
  Creation Time : Tue Sep 18 08:58:19 2007
     Raid Level : raid5
    Device Size : 488383936 (465.76 GiB 500.11 GB)
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0

    Update Time : Tue Sep 18 16:30:01 2007
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : a5cb4648 - correct
         Events : 0.4

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync   /dev/sdd1

So the md0 array was started with only partition /dev/sdd1, and the last 64
bits of the UUID were changed. The partitions /dev/sdb1 and /dev/sdc1 keep
the original UUID.
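
To make the UUID comparison concrete, here is a small sketch that reports
which of the four colon-separated 32-bit words differ between two UUIDs.
The function name uuid_diff_words is hypothetical; it does plain string
comparison and does not call mdadm.

```shell
# Print the 1-based positions of the colon-separated words that differ
# between two md UUIDs. uuid_diff_words is an illustrative helper only.
uuid_diff_words() {
    printf '%s %s\n' "$1" "$2" | awk '{
        n = split($1, a, ":")
        split($2, b, ":")
        out = ""
        for (i = 1; i <= n; i++)
            if (a[i] != b[i])
                out = out (out == "" ? "" : " ") i
        print out
    }'
}

# UUID from --examine on sdb1/sdc1 vs. UUID from --detail after reboot
uuid_diff_words 72591f57:6d449586:dc3c5ed4:95c964aa \
                72591f57:6d449586:b90eefda:5c4888bb    # prints: 3 4
```

Words 3 and 4 differ, which is the "last 64 bits" mentioned above.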

Here is part of the dmesg:

scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
        <Adaptec aic7895 Ultra SCSI adapter>
        aic7895C: Ultra Wide Channel A, SCSI Id=7, 32/253 SCBs

  Vendor: IBM       Model: DDRS-39130D       Rev: DC1B
  Type:   Direct-Access                      ANSI SCSI revision: 02
scsi0:A:6:0: Tagged Queuing enabled.  Depth 8
 target0:0:6: Beginning Domain Validation
 target0:0:6: wide asynchronous
 target0:0:6: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 8)
 target0:0:6: Domain Validation skipping write tests
 target0:0:6: Ending Domain Validation
ACPI: PCI Interrupt 0000:00:05.1[B] -> GSI 18 (level, low) -> IRQ 177
ahc_pci:0:5:1: Host Adapter Bios disabled.  Using default SCSI device parameters
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
        <Adaptec aic7895 Ultra SCSI adapter>
        aic7895C: Ultra Wide Channel B, SCSI Id=7, 32/253 SCBs

libata version 2.00 loaded.
sata_sil 0000:00:09.0: version 2.0
ACPI: PCI Interrupt 0000:00:09.0[A] -> GSI 17 (level, low) -> IRQ 185
ata1: SATA max UDMA/100 cmd 0xE083A080 ctl 0xE083A08A bmdma 0xE083A000 irq 185
ata2: SATA max UDMA/100 cmd 0xE083A0C0 ctl 0xE083A0CA bmdma 0xE083A008 irq 185
scsi2 : sata_sil
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: ATA-7, max UDMA/133, 976773168 sectors: LBA48 NCQ (depth 0/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/100
scsi3 : sata_sil
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata2.00: ATA-7, max UDMA/133, 976773168 sectors: LBA48 NCQ (depth 0/32)
ata2.00: ata2: dev 0 multi count 16
ata2.00: configured for UDMA/100
  Vendor: ATA       Model: WDC WD5000ABYS-0  Rev: 12.0
  Type:   Direct-Access                      ANSI SCSI revision: 05
  Vendor: ATA       Model: WDC WD5000ABYS-0  Rev: 12.0
  Type:   Direct-Access                      ANSI SCSI revision: 05
ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 16 (level, low) -> IRQ 193
ata3: SATA max UDMA/100 cmd 0xE085C480 ctl 0xE085C48A bmdma 0xE085C400 irq 193
ata4: SATA max UDMA/100 cmd 0xE085C4C0 ctl 0xE085C4CA bmdma 0xE085C408 irq 193
scsi4 : sata_sil
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata3.00: ATA-7, max UDMA/133, 976773168 sectors: LBA48 NCQ (depth 0/32)
ata3.00: ata3: dev 0 multi count 16
ata3.00: configured for UDMA/100
scsi5 : sata_sil
ata4: SATA link down (SStatus 0 SControl 310)
  Vendor: ATA       Model: ST3500630AS       Rev: 3.AA
  Type:   Direct-Access                      ANSI SCSI revision: 05
 target0:0:6: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 8)
SCSI device sda: 17850000 512-byte hdwr sectors (9139 MB)
sda: Write Protect is off
sda: Mode Sense: b9 00 00 08
SCSI device sda: drive cache: write back
SCSI device sda: 17850000 512-byte hdwr sectors (9139 MB)
sda: Write Protect is off
sda: Mode Sense: b9 00 00 08
SCSI device sda: drive cache: write back
 sda: sda1 sda2
sd 0:0:6:0: Attached scsi disk sda
SCSI device sdb: 976773168 512-byte hdwr sectors (500108 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 976773168 512-byte hdwr sectors (500108 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
 sdb:<6>usbcore: registered new driver usbfs
usbcore: registered new driver hub
 sdb1
sd 2:0:0:0: Attached scsi disk sdb
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt 0000:00:04.2[D] -> GSI 19 (level, low) -> IRQ 201
uhci_hcd 0000:00:04.2: UHCI Host Controller
uhci_hcd 0000:00:04.2: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:04.2: irq 201, io base 0x00001840
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
SCSI device sdc: 976773168 512-byte hdwr sectors (500108 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
SCSI device sdc: 976773168 512-byte hdwr sectors (500108 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
 sdc:<6>Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
 sdc1
sd 3:0:0:0: Attached scsi disk sdc
SCSI device sdd: 976773168 512-byte hdwr sectors (500108 MB)
sdd: Write Protect is off
sdd: Mode Sense: 00 3a 00 00
SCSI device sdd: drive cache: write back
SCSI device sdd: 976773168 512-byte hdwr sectors (500108 MB)
sdd: Write Protect is off
sdd: Mode Sense: 00 3a 00 00
SCSI device sdd: drive cache: write back
 sdd: sdd1
sd 4:0:0:0: Attached scsi disk sdd
e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
PIIX4: IDE controller at PCI slot 0000:00:04.1
PIIX4: chipset revision 1
PIIX4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x1860-0x1867, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0x1868-0x186f, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
hdc: ST3300622A, ATA DISK drive
ide1 at 0x170-0x177,0x376 on irq 15
ACPI: PCI Interrupt 0000:00:06.0[A] -> GSI 19 (level, low) -> IRQ 201
e100: eth0: e100_probe: addr 0xfc102000, irq 201, MAC addr 00:E0:18:C2:F2:B0
hdc: max request size: 512KiB
hdc: 586072368 sectors (300069 MB) w/16384KiB Cache, CHS=36481/255/63, UDMA(33)
hdc: cache flushes supported
 hdc: hdc1
raid5: automatically using best checksumming function: pIII_sse
   pIII_sse  :   988.000 MB/sec
raid5: using function: pIII_sse (988.000 MB/sec)
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
raid6: int32x1    109 MB/s
raid6: int32x2    106 MB/s
raid6: int32x4    123 MB/s
raid6: int32x8    121 MB/s
raid6: mmxx1      284 MB/s
raid6: mmxx2      326 MB/s
raid6: sse1x1     245 MB/s
raid6: sse1x2     309 MB/s
raid6: using algorithm sse1x2 (309 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
md: md0 stopped.
md: bind<sdd1>
raid5: device sdd1 operational as raid disk 2
raid5: not enough operational devices for md0 (2/3 failed)
RAID5 conf printout:
 --- rd:3 wd:1 fd:2
 disk 2, o:1, dev:sdd1
raid5: failed to run raid set md0
md: pers->run() failed ...
Attempting manual resume
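
The md lines at the end of the log are the relevant part: only sdd1 is bound
before the array is run, even though sdb1 and sdc1 were detected earlier. A
small sketch for pulling the bound member names out of a log like this (the
helper name bound_members is hypothetical; it reads the log text from stdin):

```shell
# Print the device name from every "md: bind<NAME>" line on stdin.
# bound_members is an illustrative helper, not an existing tool.
bound_members() {
    sed -n 's/.*md: bind<\([^>]*\)>.*/\1/p'
}

# usage: dmesg | bound_members
```

For the log above this would list only sdd1.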


So now I have to stop the array and recreate it:

# mdadm --stop /dev/md0
mdadm: stopped /dev/md0

# mdadm --create /dev/md0 --assume-clean --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: /dev/sdb1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Tue Sep 18 08:58:19 2007
mdadm: /dev/sdc1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Tue Sep 18 08:58:19 2007
mdadm: /dev/sdd1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Tue Sep 18 08:58:19 2007
Continue creating array? y
mdadm: array /dev/md0 started.

And it is working fine (with new UUID):

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Sep 18 16:42:09 2007
     Raid Level : raid5
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
    Device Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Sep 18 16:42:09 2007
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : e3419c0a:59994dc9:bcdcb2cf:6e8e5621
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1


Any suggestions as to why it starts with two failed disks after the reboot,
and why the UUID changes?

Thanks for any help.

Tomas
