failed drive?

Hi all, I'm having some issues and am a little confused. I was checking our servers today and saw something strange: cat /proc/mdstat shows that one device, md0, is inactive, and I'm not really sure why. I did a bit more digging and testing with smartctl, and it says that /dev/sdg (part of md0) is failing, estimated to fail within 24 hours. But if I run df -h it doesn't even show md0. I was talking to a friend about this and we disagreed: based on what smartctl says, I believe the drive is failing but hasn't failed yet, while he doesn't think it's a problem with the drive at all. Do you have any thoughts on this? And why would md0 suddenly be inactive while mdadm still reports two working devices (sdg, sdh)? Output from everything I ran is below.

(/proc/mdstat)
[root@csdatastandby3 bin]# cat /proc/mdstat
Personalities : [raid1] [raid10]
md125 : active raid10 sdf1[5] sdc1[2] sde1[4] sda1[0] sdb1[1] sdd1[3]
      11720655360 blocks super 1.2 512K chunks 2 near-copies [6/6] [UUUUUU]
     
md126 : active raid1 sdg[1] sdh[0]
      463992832 blocks super external:/md0/0 [2/2] [UU]
     
md0 : inactive sdh[1](S) sdg[0](S)
      6306 blocks super external:imsm
      
unused devices: <none>
[root@csdatastandby3 bin]#
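
In case it helps, I can also dump the raw IMSM metadata straight off the member disks. This is just a sketch of what I'd run (plain mdadm --examine / --detail-platform, assuming the disks are still enumerated as sdg/sdh):

mdadm --examine /dev/sdg        # read the IMSM superblock from the disk itself
mdadm --examine /dev/sdh        # same for the other container member
mdadm --detail-platform         # what the Intel option ROM says it supports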


(smartctl)
[root@csdatastandby3 bin]# smartctl -H /dev/sdg
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-431.17.1.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   002   002   036    Pre-fail  Always   FAILING_NOW 32288

[root@csdatastandby3 bin]#
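
So far I've only run the health check above, not an actual self-test. If that would tell us anything more, my plan was roughly this (standard smartctl self-test flags):

smartctl -t short /dev/sdg      # kick off the short offline self-test (~2 min)
smartctl -l selftest /dev/sdg   # read back the self-test log once it finishes
smartctl -a /dev/sdg            # full attribute dump, in case anyone wants it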

(df -h)
[root@csdatastandby3 bin]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/md126p4    404G  4.3G  379G   2% /
tmpfs            16G  172K   16G   1% /dev/shm
/dev/md126p2    936M   74M  815M   9% /boot
/dev/md126p1    350M  272K  350M   1% /boot/efi
/dev/md125       11T  4.2T  6.1T  41% /data
[root@csdatastandby3 bin]#
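
My working theory on df is that it only lists mounted filesystems, and md0 is a metadata container (per the mdadm output below) rather than anything mountable, so it would never show up there. To double-check that, I was going to compare the kernel's block devices against what's actually mounted, something like:

grep md /proc/partitions    # md block devices the kernel knows about
mount | grep /dev/md        # which md devices carry a mounted filesystem
blkid /dev/md0              # whether md0 has any filesystem signature at all

Does that reasoning hold, or should md0 normally be visible to df?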


(mdadm -D /dev/md0)
[root@csdatastandby3 bin]# mdadm -D /dev/md0
/dev/md0:
        Version : imsm
     Raid Level : container
  Total Devices : 2

Working Devices : 2


           UUID : 32c1fbb7:4479296b:53c02d9b:666a08f6
  Member Arrays : /dev/md/Volume0

    Number   Major   Minor   RaidDevice

       0       8       96        -        /dev/sdg
       1       8      112        -        /dev/sdh
[root@csdatastandby3 bin]#
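
And if the list agrees that sdg is dying, here is roughly how I understood a swap goes with an IMSM container; please correct me if I have it backwards (the /dev/sdX name for the replacement disk is just a placeholder):

mdadm /dev/md126 --fail /dev/sdg    # mark the failing disk faulty in the RAID1
mdadm /dev/md0 --remove /dev/sdg    # then pull it out of the IMSM container
mdadm /dev/md0 --add /dev/sdX       # replacement goes into the container (md0),
                                    # not md126; mdmon should start the rebuild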


thanks


-dustink