Re: Spare disk not becoming active

I don't mean to be rude, but it's been two weeks and my system is still in this state. Bump, anyone?

A thorough search of the web (before I originally posted this to the list) turned up nothing. I found no explanation of why this occurs, only that it has happened a number of times. Most reports say that completely stopping and reassembling the array fixes it, but I tried that and the disk still returned to spare. A few reports describe the same situation as mine, but none of them received a response that seems complete.

Eventually the discussions run to wiping the disks and starting again. That seems a bit drastic, and I'm concerned that *one* of the disks is faulty but not being reported as such, so I don't want to pick the wrong one to wipe the superblock off. mdadm reports no errors, but SMART indicates there may be a problem with the *active* disk, which is even more worrying, because without making the spare active I can't remove the active disk to test it properly.
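
For reference, the SMART check I've been doing is roughly along these lines (I'm assuming here that /dev/sdf is the whole disk behind the active partition /dev/sdf1, and that smartmontools is installed):

$ sudo smartctl -H /dev/sdf        # overall health self-assessment (assumed device name)
$ sudo smartctl -a /dev/sdf        # full attribute and error-log dump
$ sudo smartctl -t short /dev/sdf  # start a short self-test; results show up in the -a output afterwards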

Any ideas?

Cheers,
Tudor.

On 03/12/12 11:04, Tudor Holton wrote:
Hello,

I'm having some trouble with an array that has become degraded.

/proc/mdstat shows this array state:

md101 : active raid1 sdf1[0] sdb1[2](S)
      1953511936 blocks [2/1] [U_]


mdadm --detail says:

/dev/md101:
        Version : 0.90
  Creation Time : Thu Jan 13 14:34:27 2011
     Raid Level : raid1
     Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 101
    Persistence : Superblock is persistent

    Update Time : Fri Nov 23 03:23:04 2012
          State : clean, degraded
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

           UUID : 43e92a79:90295495:0a76e71e:56c99031 (local to host barney)
         Events : 0.2127

    Number   Major   Minor   RaidDevice State
       0       8       81        0      active sync /dev/sdf1
       1       0        0        1      removed

       2       8       17        -      spare   /dev/sdb1


If I attempt to force the spare to become active, it begins to recover:
$ sudo mdadm -S /dev/md101
mdadm: stopped /dev/md101
$ sudo mdadm --assemble --force --no-degraded /dev/md101 /dev/sdf1 /dev/sdb1
mdadm: /dev/md101 has been started with 1 drive (out of 2) and 1 spare.
$ cat /proc/mdstat
md101 : active raid1 sdf1[0] sdb1[2]
      1953511936 blocks [2/1] [U_]
[>....................] recovery = 0.0% (541440/1953511936) finish=420.8min speed=77348K/sec

The recovery runs for the estimated time, but when it finishes the disk returns to being a spare.
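
If it would help, I can also post the per-device superblocks; I'm assuming --examine is the right way to dump them, i.e. something like:

$ sudo mdadm --examine /dev/sdf1   # superblock of the active member
$ sudo mdadm --examine /dev/sdb1   # superblock of the disk that keeps reverting to spare
$ dmesg | grep -i md101            # anything the kernel logged during the rebuild (assumed filter)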

Neither disk partition reports errors:
$ cat /sys/block/md101/md/dev-sdf1/errors
0
$ cat /sys/block/md101/md/dev-sdb1/errors
0

Are there mdadm logs that would show why this is not recovering properly? How else can I debug this?

Cheers,
Tudor.

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
