Re: Help, array corrupted after clean shutdown.

Oliver Schinagl <oliver+list@xxxxxxxxxxx> · Sat, 06 Apr 2013 20:01:49 +0200

On 04/06/13 19:44, Durval Menezes wrote:
Hi Oliver,

Seems most of your problems are filesystem corruption (the extN family
is well known for lack of robustness).

I would try to mount the filesystem read-only (without fsck) and copy
off as much data as possible... Then fsck and try to copy the rest.

Good luck.
It fails to mount ;)

However, how can I verify that the array itself is not corrupt (while degraded)?
At least that way, I can try my luck with the ext4 tools.

--
    Durval.

On Apr 6, 2013 12:13 PM, "Oliver Schinagl" <oliver+list@xxxxxxxxxxx> wrote:

    On 04/06/13 17:06, Durval Menezes wrote:

        Oliver,

        What file system? LVM or direct on the MD device?

    Sorry, should have mentioned this.

    I have four 1.5 TB SATA drives connected to the onboard SATA controller.

    I have made 1 GPT partition on top of each drive and then made a
    raid5 array on top of those devices:

    md101 : active (read-only) raid5 sdd1[0] sde1[4] sdf1[1]
           4395413760 blocks super 1.2 level 5, 256k chunk, algorithm 2 [4/3] [UU_U]

    I then formatted /dev/md101 with ext4.
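
The status line from /proc/mdstat quoted above already says which slot is missing. As a rough illustration only, here is a minimal Python sketch that reads the "[4/3] [UU_U]" fields. The function name is made up, and the line is pasted in by hand (joined onto one line) rather than read from the live system:

```python
import re

# The status line from /proc/mdstat quoted above, joined onto one line.
STATUS = "4395413760 blocks super 1.2 level 5, 256k chunk, algorithm 2 [4/3] [UU_U]"


def md_health(status_line):
    """Return (expected, working, missing_slots) from an mdstat status line."""
    match = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", status_line)
    if match is None:
        raise ValueError("no [n/m] [UU_U] fields found")
    expected, working = int(match.group(1)), int(match.group(2))
    # "U" marks a slot that is up, "_" a slot that is missing.
    missing = [i for i, flag in enumerate(match.group(3)) if flag == "_"]
    return expected, working, missing


expected, working, missing = md_health(STATUS)
print(f"{working} of {expected} devices up, missing slots: {missing}")
```

Slot 2 is the missing one here, which matches sdc1 reporting "Device Role : Active device 2" in the --examine output further down.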

    tune2fs still happily runs on /dev/md101, but of course that doesn't
    mean anything.

    riley tmp # tune2fs -l /dev/md101
    tune2fs 1.42 (29-Nov-2011)
    Filesystem volume name:   data01
    Last mounted on:          /tank/01
    Filesystem UUID:          9c812d61-96ce-4b71-9763-b77e8b9618d1
    Filesystem magic number:  0xEF53
    Filesystem revision #:    1 (dynamic)
    Filesystem features:      has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
    Filesystem flags:         signed_directory_hash
    Default mount options:    (none)
    Filesystem state:         not clean
    Errors behavior:          Continue
    Filesystem OS type:       Linux
    Inode count:              274718720
    Block count:              1098853440
    Reserved block count:     0
    Free blocks:              228693396
    Free inodes:              274387775
    First block:              0
    Block size:               4096
    Fragment size:            4096
    Reserved GDT blocks:      762
    Blocks per group:         32768
    Fragments per group:      32768
    Inodes per group:         8192
    Inode blocks per group:   512
    RAID stride:              64
    RAID stripe width:        192
    Flex block group size:    16
    Filesystem created:       Wed Apr 28 16:42:58 2010
    Last mount time:          Tue May  4 17:14:48 2010
    Last write time:          Sat Apr  6 11:45:57 2013
    Mount count:              10
    Maximum mount count:      32
    Last checked:             Wed Apr 28 16:42:58 2010
    Check interval:           15552000 (6 months)
    Next check after:         Mon Oct 25 16:42:58 2010
    Lifetime writes:          3591 GB
    Reserved blocks uid:      0 (user root)
    Reserved blocks gid:      0 (group root)
    First inode:              11
    Inode size:               256
    Required extra isize:     28
    Desired extra isize:      28
    Journal inode:            8
    First orphan inode:       17
    Default directory hash:   half_md4
    Directory Hash Seed:      f1248a94-5a6a-4e4a-af8a-68b019d13ef6
    Journal backup:           inode blocks
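
As a sanity check on the numbers above, the filesystem size reported by tune2fs can be compared with the array size from /proc/mdstat, and the RAID stride and stripe width with the chunk size. A minimal Python sketch, with all values copied by hand from the quoted output. The formulas (stride = chunk size / block size, stripe width = stride x number of data disks) are the usual ext4 convention and are assumed here, and the variable names are made up for illustration:

```python
# Values copied by hand from the mdstat and tune2fs output quoted above.
MD_BLOCKS_1K = 4395413760      # /proc/mdstat reports 1 KiB blocks
CHUNK_BYTES = 256 * 1024       # "256k chunk"
RAID_DEVICES = 4               # raid5: one device's worth of parity

FS_BLOCK_COUNT = 1098853440    # tune2fs "Block count"
FS_BLOCK_SIZE = 4096           # tune2fs "Block size"
FS_STRIDE = 64                 # tune2fs "RAID stride"
FS_STRIPE_WIDTH = 192          # tune2fs "RAID stripe width"

array_bytes = MD_BLOCKS_1K * 1024
fs_bytes = FS_BLOCK_COUNT * FS_BLOCK_SIZE

# Expected ext4 geometry for this array (assumed convention, see above).
expected_stride = CHUNK_BYTES // FS_BLOCK_SIZE
expected_stripe_width = expected_stride * (RAID_DEVICES - 1)

print("array bytes     :", array_bytes)
print("filesystem bytes:", fs_bytes)
print("fs fills array  :", fs_bytes == array_bytes)
print("stride ok       :", expected_stride == FS_STRIDE)
print("stripe width ok :", expected_stripe_width == FS_STRIPE_WIDTH)
```

All three checks come out true for these numbers. That only shows the ext4 superblock geometry still matches the array; it says nothing about the state of the data behind it.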

        --
             Durval.

        On Apr 6, 2013 8:23 AM, "Oliver Schinagl"
        <oliver+list@xxxxxxxxxxx> wrote:

             Hi,

             I had a power failure today, to which my UPS responded nicely
             and made my server shut down normally. One would expect
             everything to be well, right? The array, as far as I know, was
             operating without problems before the shutdown; all 4 devices
             were online as normal. mdadm sends me an e-mail if something
             is wrong, and so does smartctl.

             The first thing I noticed was that I had 2 (S) drives for
             /dev/md101, so I started examining things. At first I thought
             it was some mdadm weirdness, where it had failed to assemble
             the array with all components.
             mdadm -A /dev/md101 /dev/sd[cdef]1 failed and gave the same
             result. Something was really wrong.

             I checked and compared the output of mdadm --examine on all
             drives (like -Evvvvs below) and found that /dev/sdc1's event
             count was wrong. /dev/sdf1 and /dev/sdd1 matched (and later
             sde1 too, but more on that in a sec). So sdc1 may have been
             dropped from the array without me knowing it; unlikely but
             possible. The odd thing is the huge difference in event
             counts, while all four are marked as ACTIVE.

             So then on to sde1: why was it failing on that? The GPT table
             was completely gone. 00000. Gone. I used hexdump to examine
             the drive further, and at 0x00041000 there was the mdraid
             table, as one would expect. Good, so it looks like only the
             GPT had been wiped for some mysterious reason. Re-creating
             the GPT quickly revealed that mdadm's information was still
             correct (as can be seen below).
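
The hexdump check described above can be sketched in Python. This is only an illustration under stated assumptions: that the v1.2 superblock sits 4096 bytes into the member device (the "Super Offset : 8 sectors" in the output further down, with 512-byte sectors) and that the magic a92b4efc is stored little-endian on disk. The function name and the fake image are made up; the file is only ever opened read-only by the check itself.

```python
import os
import struct
import tempfile

MD_MAGIC = 0xA92B4EFC        # "Magic : a92b4efc" in mdadm --examine
SUPER_OFFSET = 8 * 512       # "Super Offset : 8 sectors", assuming 512-byte sectors


def has_md_magic(path, member_start=0):
    """Return True if the md magic is found 4 KiB past member_start.

    member_start is the byte offset of the member (partition) inside the
    file, so a whole disk can be probed without a partition table.
    """
    with open(path, "rb") as dev:          # read-only, nothing is written
        dev.seek(member_start + SUPER_OFFSET)
        raw = dev.read(4)
    if len(raw) < 4:
        return False
    (magic,) = struct.unpack("<I", raw)    # assumed little-endian on disk
    return magic == MD_MAGIC


# Demo on a tiny fake "disk": put the magic at 0x00041000, where the post
# found it. Under the assumptions above, the member then starts at 0x00040000.
with tempfile.NamedTemporaryFile(delete=False) as img:
    img.seek(0x00041000)
    img.write(struct.pack("<I", MD_MAGIC))

found_at_member = has_md_magic(img.name, member_start=0x00040000)
found_at_disk_start = has_md_magic(img.name, member_start=0)
print(found_at_member, found_at_disk_start)
os.unlink(img.name)
```

On a real system the path would be the whole-disk device node, and a match only says a superblock-like magic is present at that offset, not that the rest of the superblock is intact.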

             So ignoring sdc1 and assembling the array as is should be
             fine, right? No.
             mdadm -A /dev/md101 /dev/sd[def]1 worked without error.

             I always do a fsck before and after a reboot (unless of course
             I can't do the shutdown fsck) and verify /proc/mdstat after a
             boot. So before mounting, as always, I tried to run
             fsck /dev/md101 -C -; but that came up with tons of errors.
             I didn't fix anything and aborted.

             And here we are now. I can't just copy the entire disks (1.5TB
             per disk) and 'experiment'; I don't have 4 spare disks. The
             first thing I would want to try is
             mdadm -A /dev/sd[cdf]1 --force (leaving out the possibly
             corrupted sde1) and see what that does.

             All that said, assembling with the 3 drives I 'guessed' to be
             correct did of course increase the event count. sdc1 of course
             didn't partake in this. Assuming that it is in sync with the
             rest, what is the worst that can happen? And does the
             --read-only flag protect against it?

             Linux riley 3.7.4-gentoo #2 SMP Tue Feb 5 16:20:59 CET 2013 x86_64 AMD Phenom(tm) II X4 905e Processor AuthenticAMD GNU/Linux

             riley tmp # mdadm --version
             mdadm - v3.1.4 - 31st August 2010

             riley tmp # mdadm -Evvvvs
             /dev/sdf1:
                        Magic : a92b4efc
                      Version : 1.2
                  Feature Map : 0x0
                   Array UUID : 2becc012:2d317133:2447784c:1aab300d
                         Name : riley:data01  (local to host riley)
                Creation Time : Tue Apr 27 18:03:37 2010
                   Raid Level : raid5
                 Raid Devices : 4

               Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
                   Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
                Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
                  Data Offset : 272 sectors
                 Super Offset : 8 sectors
                        State : clean
                  Device UUID : 97877935:04c16c5f:0746cb98:63bffb4c

                  Update Time : Sat Apr  6 11:46:03 2013
                     Checksum : b585717a - correct
                       Events : 512993

                       Layout : left-symmetric
                   Chunk Size : 256K

                 Device Role : Active device 1
                 Array State : AA.A ('A' == active, '.' == missing)
             mdadm: No md superblock detected on /dev/sdf.
             /dev/sde1:
                        Magic : a92b4efc
                      Version : 1.2
                  Feature Map : 0x0
                   Array UUID : 2becc012:2d317133:2447784c:1aab300d
                         Name : riley:data01  (local to host riley)
                Creation Time : Tue Apr 27 18:03:37 2010
                   Raid Level : raid5
                 Raid Devices : 4

               Avail Dev Size : 2930275847 (1397.26 GiB 1500.30 GB)
                   Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
                Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
                  Data Offset : 776 sectors
                 Super Offset : 8 sectors
                        State : clean
                  Device UUID : 3f48d5a8:e3ee47a1:23c8b895:addd3dd0

                  Update Time : Sat Apr  6 11:46:03 2013
                     Checksum : eaec006b - correct
                       Events : 512993

                       Layout : left-symmetric
                   Chunk Size : 256K

                 Device Role : Active device 3
                 Array State : AA.A ('A' == active, '.' == missing)
             mdadm: No md superblock detected on /dev/sde.
             /dev/sdd1:
                        Magic : a92b4efc
                      Version : 1.2
                  Feature Map : 0x0
                   Array UUID : 2becc012:2d317133:2447784c:1aab300d
                         Name : riley:data01  (local to host riley)
                Creation Time : Tue Apr 27 18:03:37 2010
                   Raid Level : raid5
                 Raid Devices : 4

               Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
                   Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
                Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
                  Data Offset : 272 sectors
                 Super Offset : 8 sectors
                        State : clean
                  Device UUID : 236f6c48:2a1bcf6b:a7d7d861:53950637

                  Update Time : Sat Apr  6 11:46:03 2013
                     Checksum : 87f31abb - correct
                       Events : 512993

                       Layout : left-symmetric
                   Chunk Size : 256K

                 Device Role : Active device 0
                 Array State : AA.A ('A' == active, '.' == missing)
             mdadm: No md superblock detected on /dev/sdd.
             /dev/sdc1:
                        Magic : a92b4efc
                      Version : 1.2
                  Feature Map : 0x0
                   Array UUID : 2becc012:2d317133:2447784c:1aab300d
                         Name : riley:data01  (local to host riley)
                Creation Time : Tue Apr 27 18:03:37 2010
                   Raid Level : raid5
                 Raid Devices : 4

               Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
                   Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
                Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
                  Data Offset : 272 sectors
                 Super Offset : 8 sectors
                        State : active
                  Device UUID : 3ce8e262:ad864aee:9055af9b:6cbfd47f

                  Update Time : Sat Mar 16 20:20:47 2013
                     Checksum : a7686a57 - correct
                       Events : 180132

                       Layout : left-symmetric
                   Chunk Size : 256K

                 Device Role : Active device 2
                 Array State : AAAA ('A' == active, '.' == missing)
             mdadm: No md superblock detected on /dev/sdc.
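
The comparison of event counts across the four members can also be done mechanically. Below is a minimal Python sketch that parses a trimmed, hand-copied subset of the --examine fields above and reports which members lag behind the highest event count. The function names are made up for illustration, and the parser only handles the "Field : number" lines it is shown here:

```python
import re

# Trimmed copy of the relevant fields from the mdadm -E output above.
EXAMINE = """\
/dev/sdf1:
 Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
    Data Offset : 272 sectors
         Events : 512993
/dev/sde1:
 Avail Dev Size : 2930275847 (1397.26 GiB 1500.30 GB)
    Data Offset : 776 sectors
         Events : 512993
/dev/sdd1:
 Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
    Data Offset : 272 sectors
         Events : 512993
/dev/sdc1:
 Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
    Data Offset : 272 sectors
         Events : 180132
"""


def parse_examine(text):
    """Map each device name to a dict of its integer-valued fields."""
    devices, current = {}, None
    for line in text.splitlines():
        header = re.match(r"^(/dev/\S+):$", line.strip())
        if header:
            current = devices.setdefault(header.group(1), {})
            continue
        field = re.match(r"^\s*([A-Za-z ]+?)\s+:\s+(\d+)\b", line)
        if field and current is not None:
            current[field.group(1)] = int(field.group(2))
    return devices


def stale_members(devices):
    """Return {device: events behind} for members below the highest count."""
    newest = max(d["Events"] for d in devices.values())
    return {name: newest - d["Events"]
            for name, d in devices.items() if d["Events"] < newest}


devs = parse_examine(EXAMINE)
print(stale_members(devs))
print({name: d["Data Offset"] for name, d in devs.items()})
```

For the values in this post, only /dev/sdc1 is flagged, 332861 events behind the other three. The second line also shows that sde1 reports a different data offset (776 sectors) from the other members (272 sectors); the sketch only reports that difference and draws no conclusion from it.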

             Before I assembled the array for the first time
             (mdadm -A /dev/md101 /dev/sdd1 /dev/sde1 /dev/sdf1), this is
             what it looked like; identical to the above, with the
             exception of the number of events.

             riley tmp # mdadm --examine /dev/sde1
             /dev/sde1:
                        Magic : a92b4efc
                      Version : 1.2
                  Feature Map : 0x0
                   Array UUID : 2becc012:2d317133:2447784c:1aab300d
                         Name : riley:data01  (local to host riley)
                Creation Time : Tue Apr 27 18:03:37 2010
                   Raid Level : raid5
                 Raid Devices : 4

               Avail Dev Size : 2930275847 (1397.26 GiB 1500.30 GB)
                   Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
                Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
                  Data Offset : 776 sectors
                 Super Offset : 8 sectors
                        State : clean
                  Device UUID : 3f48d5a8:e3ee47a1:23c8b895:addd3dd0

                  Update Time : Sat Apr  6 09:44:30 2013
                     Checksum : eaebe3ea - correct
                       Events : 512989

                       Layout : left-symmetric
                   Chunk Size : 256K

                 Device Role : Active device 3
                 Array State : AA.A ('A' == active, '.' == missing)


--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html