Re: RAID6 12 device assemble force failure


 



On Wed, 3 Jul 2024 23:10:52 +0200
Adam Niescierowicz <adam.niescierowicz@xxxxxxxxxx> wrote:

> On 3.07.2024 o 12:16, Mariusz Tkaczyk wrote:
> > On Wed, 3 Jul 2024 09:42:53 +0200
> > Mariusz Tkaczyk<mariusz.tkaczyk@xxxxxxxxxxxxxxx>  wrote:
> >
> > I was able to achieve a similar state:
> >
> > mdadm -E /dev/nvme2n1
> > /dev/nvme2n1:
> >            Magic : a92b4efc
> >          Version : 1.2
> >      Feature Map : 0x0
> >       Array UUID : 8fd2cf1a:65a58b8d:0c9a9e2e:4684fb88
> >             Name : gklab-localhost:my_r6  (local to host gklab-localhost)
> >    Creation Time : Wed Jul  3 09:43:32 2024
> >       Raid Level : raid6
> >     Raid Devices : 4
> >
> >   Avail Dev Size : 1953260976 sectors (931.39 GiB 1000.07 GB)
> >       Array Size : 10485760 KiB (10.00 GiB 10.74 GB)
> >    Used Dev Size : 10485760 sectors (5.00 GiB 5.37 GB)
> >      Data Offset : 264192 sectors
> >     Super Offset : 8 sectors
> >     Unused Space : before=264112 sectors, after=1942775216 sectors
> >            State : clean
> >      Device UUID : b26bef3c:51813f3f:e0f1a194:c96c4367
> >
> >      Update Time : Wed Jul  3 11:49:34 2024
> >    Bad Block Log : 512 entries available at offset 16 sectors
> >         Checksum : a96eaa64 - correct
> >           Events : 6
> >
> >           Layout : left-symmetric
> >       Chunk Size : 512K
> >
> >     Device Role : Active device 2
> >     Array State : ..A. ('A' == active, '.' == missing, 'R' == replacing)
> >
> >
> > In my case, the events value was different and /dev/nvme3n1 had a different
> > Array State:
> >   Device Role : Active device 3
> >     Array State : ..AA ('A' == active, '.' == missing, 'R' == replacing)  
> 
> Is this kind of array behavior how it should be?
> 
> Here is why I'm asking, in theory:
> 
> We have a bitmap, so when the third drive disconnects from the array we
> should have time to stop the array in a faulty state before the bitmap
> space runs out, yes?
> Then the array sends a notification to the FS (filesystem) that there is a
> problem and will discard all write operations.

At the moment the failure of the third disk is recorded, the array is moved to
the BROKEN state, which means that every new write fails automatically. Only
reads are allowed.
This makes it impossible for the bitmap to fill up (there is no bitmap update
on the read path): if the array is broken, we abort before the bitmap is
updated for a new write.

No notification is sent to the filesystem, but the filesystem may discover the
problem after receiving an error on every write.
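
If you want to see this from user space, one way (assuming a kernel new
enough to report the broken state, and using /dev/md127 only as an example
name) is to read the md sysfs attribute and /proc/mdstat:

    # reported array state, e.g. "clean", "active" or "broken"
    cat /sys/block/md127/md/array_state

    # overall member status; failed devices are marked with (F)
    cat /proc/mdstat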

> 
> Data that can't be stored on the faulty device should be kept in the bitmap.
> Then, when we reattach the missing third drive and write the missing data
> from the bitmap to the disk, everything should be good, yes?
> 
> Am I thinking about this correctly?
> 

The bitmap doesn't record the written data itself. Please read:
https://man7.org/linux/man-pages/man4/md.4.html
The bitmap is used to optimize resync, and recovery in the case of a re-add
(but we know that it won't work in your case).
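
If you want to convince yourself of that, you can dump the internal
write-intent bitmap from a member device (assuming the array actually has an
internal bitmap; the device name is just an example). It only contains
dirty-region counters, not the written data:

    mdadm -X /dev/nvme2n1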

> 
> > And I failed to start it, sorry. It is possible, but it requires working
> > with sysfs and ioctls directly, so it is much safer to recreate the array
> > with --assume-clean, especially since it is a fresh array.
> 
> I recreated the array, LVM detected the PV and works fine, but XFS on top
> of the LVM is missing data from the recreated array.
> 

Well, it looks like you did it right, because LVM is up. Please check whether
the disks are ordered the same way in the new array (indexes of the drives in
the mdadm -D output), just to be doubly sure.
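
For example (device names are only illustrative), compare the "Device Role"
values you collected from the old superblocks with the RaidDevice column of
the recreated array:

    # role recorded in a member's superblock
    mdadm -E /dev/nvme2n1 | grep 'Device Role'

    # slot assignment in the running array (RaidDevice column of the
    # device table at the end of the output)
    mdadm -D /dev/md127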

For the filesystem issues - I cannot help with that. I hope you can recover at
least part of the data :(

Thanks,
Mariusz




