Re: 4-disk RAID6 (non-standard layout) normalise hung, now all disks spare

Phil Turmel <philip@xxxxxxxxxx> · Sat, 26 Jun 2021 10:28:13 -0400

Good morning Jason, Wol,

On 6/26/21 9:13 AM, antlists wrote:
> On 26/06/2021 12:09, Jason Flood wrote:
>>      Reshape Status : 99% complete
>>       Delta Devices : -1, (5->4)
>>          New Layout : left-symmetric
>>
>>                Name : Universe:0
>>                UUID : 3eee8746:8a3bf425:afb9b538:daa61b29
>>              Events : 184255
>>
>>      Number   Major   Minor   RaidDevice State
>>         6       8       16        0      active sync   /dev/sdb
>>         7       8       32        1      active sync   /dev/sdc
>>         5       8       48        2      active sync   /dev/sdd
>>         4       8       64        3      active sync   /dev/sde
>
> Phil will know much more about this than me, but I did notice that the
> system thinks there should be FIVE raid drives. Is that an mdadm bug?

Not a bug, but a reshape from a degraded array with a reduction in space.
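
To make the "reduction in space" concrete, here is a rough sketch of the
arithmetic behind the "Delta Devices : -1, (5->4)" line.  The helper name
data_devices is made up for illustration; it only encodes the usual rule
that RAID5 gives up one device's worth of space to parity and RAID6 two.

```shell
#!/bin/sh
# Hypothetical helper: number of data-bearing devices in a parity array.
# $1 = RAID level (5 or 6), $2 = number of raid devices.
data_devices() {
    case $1 in
        5) parity=1 ;;
        6) parity=2 ;;
        *) echo "unsupported level: $1" >&2; return 1 ;;
    esac
    echo $(( $2 - parity ))
}

data_devices 6 5   # prints 3: the 5-device RAID6 seen before the reshape
data_devices 6 4   # prints 2: the normalised 4-device RAID6
```

Going from three data devices to two is why the array has to shrink as
part of the reshape.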

> That would explain the failure to assemble - it thinks there's a drive
> missing. And while I don't think we've had data-eating trouble,
> reshaping a parity raid has caused quite a lot of grief for people over
> the years ...

I've never tried it starting from a degraded array.  Might be a corner 
case bug not yet exposed.

> However, you're running a recent Ubuntu and mdadm - that should all have
> been fixed by now.

Indeed.

> Cheers,
> Wol

On 6/26/21 7:09 AM, Jason Flood wrote:
> Thanks for that, Phil - I think I'm starting to piece it all together
> now. I was going from a 4-disk RAID5 to 4-disk RAID6, so from my reading
> the backup file was recommended. The non-standard layout meant that the
> array had over 20TB usable, but standardising the layout reduced that to
> 16TB. In that case the reshape starts at the end so the critical section
> (and so the backup file) may have been in progress at the 99% complete
> point when it failed, hence the need to specify the backup file for the
> assemble command.
>
> I ran "sudo mdadm --assemble --verbose --force /dev/md0 /dev/sd[bcde]
> --backup-file=/root/raid5backup":
>
> mdadm: looking for devices for /dev/md0
> mdadm: /dev/sdb is identified as a member of /dev/md0, slot 0.
> mdadm: /dev/sdc is identified as a member of /dev/md0, slot 1.
> mdadm: /dev/sdd is identified as a member of /dev/md0, slot 2.
> mdadm: /dev/sde is identified as a member of /dev/md0, slot 3.
> mdadm: Marking array /dev/md0 as 'clean'
> mdadm: /dev/md0 has an active reshape - checking if critical section
> needs to be restored
> mdadm: No backup metadata on /root/raid5backup
> mdadm: added /dev/sdc to /dev/md0 as 1
> mdadm: added /dev/sdd to /dev/md0 as 2
> mdadm: added /dev/sde to /dev/md0 as 3
> mdadm: no uptodate device for slot 4 of /dev/md0
> mdadm: added /dev/sdb to /dev/md0 as 0
> mdadm: Need to backup 3072K of critical section..
> mdadm: /dev/md0 has been started with 4 drives (out of 5).
>

So force was sufficient to assemble.  But you are still stuck at 99%.

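Look at the output of ps to see if the background mdadm process forked
for the reshape is still running (that is the process that manages the
backup file while the kernel reshapes stripe by stripe; mdmon only runs
for external-metadata arrays such as IMSM or DDF).  If not, look in your
logs for clues as to why it died.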
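
A minimal sketch of that check, assuming the array is md0 as in the
output above and that the kernel exposes the usual md sysfs files
(sync_action and sync_completed).  The helper name reshape_state and the
10-second sampling interval are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sketch only: MD_SYSFS assumes the array is md0.
MD_SYSFS=/sys/block/md0/md

# Compare two samples of sync_completed ("done / total" sectors, or "none").
# $1 = earlier sample, $2 = later sample.
reshape_state() {
    if [ "$2" = none ]; then
        echo idle
    elif [ "${2%% *}" = "${1%% *}" ]; then
        echo stalled
    else
        echo advancing
    fi
}

if [ -r "$MD_SYSFS/sync_completed" ]; then
    cat /proc/mdstat
    cat "$MD_SYSFS/sync_action"      # reads "reshape" while one is running
    first=$(cat "$MD_SYSFS/sync_completed")
    sleep 10
    second=$(cat "$MD_SYSFS/sync_completed")
    reshape_state "$first" "$second"
    # Userspace helpers and the kernel's reshape thread, if any are alive.
    ps -eo pid,args | grep -E '[m]dadm|[m]dmon|[m]d0_reshape' ||
        echo "no reshape-related processes found"
fi
```

If the state comes back as stalled and nothing shows up in the process
list, the kernel log and the journal around the time of the hang are the
next place to look.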

If you can't find anything significant, the next step would be to back up
the currently functioning array to another system/drive collection and 
start from scratch.  I wouldn't trust anything else with the information 
available.
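
As a rough sketch of such a backup, assuming the filesystem on the array
is mounted (ideally read-only) and the destination starts out empty: copy
the tree, then compare per-file checksums before trusting the copy.  The
function names and the mount points in the usage comment are
hypothetical; rsync or any other copy tool you trust would do the same
job.

```shell
#!/bin/sh
# Print "checksum  relative/path" for every regular file under $1,
# in a stable order so two trees can be compared as text.
tree_sums() {
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum )
}

# Copy tree $1 into $2, then succeed only if both trees hash the same.
backup_and_verify() {
    mkdir -p "$2" || return 1
    cp -a "$1"/. "$2"/ || return 1
    [ "$(tree_sums "$1")" = "$(tree_sums "$2")" ]
}

# Hypothetical usage:
#   backup_and_verify /mnt/md0 /mnt/backup && echo "copy verified"
```

Only once the copy verifies would it make sense to stop the array and
recreate it.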

Phil

ps.  Convention on kernel.org mailing lists is to NOT top-post, and to 
trim unnecessary context.