Re: linux mdadm assembly error: md: cannot handle concurrent replacement and reshape. (reboot while reshaping)

Hello Kuai,

thank you for your suggestion!
You are right: when I read the source of the error message in drivers/md/raid5.c,
I saw that handling a replacement and a reshape at the same time is simply not supported.
So I did what you suggested and assembled the raid 5 (which was in the middle of a
conversion to raid 6 with two additional drives) with only its original
5 members, which should come up as a degraded array.

mdadm --assemble --run   /dev/md0 /dev/sdd /dev/sdc /dev/sdb /dev/sdi /dev/sdj

This worked and the array was assembled. /proc/mdstat now shows:


Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active (auto-read-only) raid6 sdd[0] sdi[6] sdj[4] sdb[2] sdc[1]
     4883151360 blocks super 1.2 level 6, 256k chunk, algorithm 18 [7/5] [UUU_UU_]
     bitmap: 0/8 pages [0KB], 65536KB chunk

unused devices: <none>
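The array came up auto-read-only. If I understand correctly it stays that way until
something writes to it (or until it is switched explicitly), so for now I am only doing
read-only operations on it. If I ever need it writable on purpose, I assume it would be
something like:

mdadm --readwrite /dev/md0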

mdadm --detail /dev/md0
/dev/md0:
          Version : 1.2
    Creation Time : Mon Mar  6 18:17:30 2023
       Raid Level : raid6
       Array Size : 4883151360 (4656.94 GiB 5000.35 GB)
    Used Dev Size : 976630272 (931.39 GiB 1000.07 GB)
     Raid Devices : 7
    Total Devices : 5
      Persistence : Superblock is persistent

    Intent Bitmap : Internal

      Update Time : Fri Apr 28 04:21:03 2023
            State : clean, degraded
   Active Devices : 5
  Working Devices : 5
   Failed Devices : 0
    Spare Devices : 0

           Layout : left-symmetric-6
       Chunk Size : 256K

Consistency Policy : bitmap

       New Layout : left-symmetric

             Name : solidsrv11:0  (local to host solidsrv11)
             UUID : 1a87479e:7513dd65:37c61ca1:43184f65
           Events : 6336

   Number   Major   Minor   RaidDevice State
      0       8       48        0      active sync   /dev/sdd
      1       8       32        1      active sync   /dev/sdc
      2       8       16        2      active sync   /dev/sdb
      -       0        0        3      removed
      4       8      144        4      active sync   /dev/sdj
      6       8      128        5      active sync   /dev/sdi
      -       0        0        6      removed



But when I try to mount the xfs filesystem on it, the mount fails:

mount: /mnt/image: mount(2) system call failed: Structure needs cleaning.

When I run xfs_repair in no-modify mode, it tells me that no valid superblock
can be found:

xfs_repair -n /dev/md0
Phase 1 - find and verify superblock...
couldn't verify primary superblock - not enough secondary superblocks with matching geometry !!!

attempting to find secondary superblock...
.................found candidate secondary superblock...
unable to verify superblock, continuing...
.found candidate secondary superblock...
unable to verify superblock, continuing...

...

.found candidate secondary superblock...
unable to verify superblock, continuing...
.found candidate secondary superblock...
unable to verify superblock, continuing...
...........................................
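I suppose I could still try a read-only mount that skips XFS log recovery (as far as I
know xfs supports this via the norecovery option), something like:

mount -o ro,norecovery /dev/md0 /mnt/image

If even that fails with the same error, I assume the on-disk layout really is inconsistent
and not just a dirty log.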

Sadly I do not exactly understand what happens in the grow+replacement phase, where all
the trouble began. As far as I can see, the two added hard disk drives still have their old
partition tables, so I suppose the reshape was still moving the raid 5 geometry towards the
transient raid-5-to-6 geometry. I am not sure whether the raid 5 guarantee (one drive may
fail) still holds during this process. In any case, the two additional drives have been
treated as spares since the reboot, and one drive of the former raid 5 (now raid 6) seems
to be defective.
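To see how far the reshape actually got before the reboot, I suppose I can look at the
per-device superblocks; a rough sketch (using the member names from the assemble above,
and grepping for the fields mdadm --examine prints):

for d in /dev/sdd /dev/sdc /dev/sdb /dev/sdi /dev/sdj; do
    echo "== $d =="
    mdadm --examine "$d" | grep -Ei 'events|reshape|new layout|device role|array state'
done

If the reshape position and the event counters agree across the members, I would at least
know that the metadata itself is still consistent.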

Is it possible that the restart of the reshape somehow scrambled the raid set metadata and
messed up the raid-level striping while the grow continued? The still-mounted device then
crashed and disappeared from the mounts, and from that point on there was no way to
reconstruct the damaged metadata and striping?

This whole matter of striped data, transient raid geometries, expansion and growth
processing, etc. seems so complex and opaque to me that I am starting to consider the data
on this raid set as lost :(

I would be very grateful for any tools or suggestions that could help save at least part
of the data on the raid.
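If it helps, I am prepared to run any further experiments against copy-on-write overlays
instead of the raw disks (I believe the linux-raid wiki describes this approach). A minimal
sketch, assuming a sparse overlay file per member, here only for /dev/sdd:

truncate -s 10G /tmp/overlay-sdd
loop=$(losetup -f --show /tmp/overlay-sdd)
size=$(blockdev --getsz /dev/sdd)
dmsetup create sdd-cow --table "0 $size snapshot /dev/sdd $loop P 8"
# repeat for the other members, then assemble from /dev/mapper/*-cow
# instead of the real disks

That way nothing I try can make the situation on the real drives any worse.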

regards,

Peter



On 28.04.23 at 04:01, Yu Kuai wrote:
Hi,

On 2023/04/28 5:09, Peter Neuwirth wrote:

------------------------------------------------------------------------------------------------------------------------
Some Logs:
------------------------------------------------------------------------------------------------------------------------

uname -a ; mdadm --version
Linux srv11 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux
mdadm - v4.1 - 2018-10-01

srv11:~# mdadm -D /dev/md0
/dev/md0:
            Version : 1.2
      Creation Time : Mon Mar  6 18:17:30 2023
         Raid Level : raid6
      Used Dev Size : 976630272 (931.39 GiB 1000.07 GB)
       Raid Devices : 7
      Total Devices : 6
        Persistence : Superblock is persistent

        Update Time : Thu Apr 27 17:36:15 2023
              State : active, FAILED, Not Started
     Active Devices : 5
    Working Devices : 6
     Failed Devices : 0
      Spare Devices : 1

             Layout : left-symmetric-6
         Chunk Size : 256K

Consistency Policy : unknown

         New Layout : left-symmetric

               Name : solidsrv11:0  (local to host solidsrv11)
               UUID : 1a87479e:7513dd65:37c61ca1:43184f65
             Events : 4700

     Number   Major   Minor   RaidDevice State
        -       0        0        0      removed
        -       0        0        1      removed
        -       0        0        2      removed
        -       0        0        3      removed
        -       0        0        4      removed
        -       0        0        5      removed
        -       0        0        6      removed

        -       8       32        2      sync   /dev/sdc
        -       8      144        4      sync   /dev/sdj
        -       8       80        0      sync   /dev/sdf
        -       8       16        1      sync   /dev/sdb
        -       8      128        5      sync   /dev/sdi
        -       8       96        4      spare rebuilding /dev/sdg

It looks like /dev/sdg is not one of the original devices: the log above shows that
RaidDevice 3 is missing and that /dev/sdg is a replacement for /dev/sdj.

So the reshape is still in progress and, at the same time, sdg is the replacement for
sdj, which matches the condition in raid5_run:

7952                 if (rcu_access_pointer(conf->disks[i].replacement) &&
7953                     conf->reshape_progress != MaxSector) {
7954                         /* replacements and reshape simply do not mix. */
7955                         pr_warn("md: cannot handle concurrent replacement and reshape.\n");
7956                         goto abort;
7957                 }

I'm by no means a raid5 expert, but I would suggest removing /dev/sdg and trying to
assemble again.

Thanks,
Kuai




