Re: Confused about device counting in MD RAID1

On Sun, 22 Dec 2024 10:45:02 +0100
Tomas Mudrunka <mudrunka@xxxxxxxxx> wrote:

> Hello,
> I am working on an implementation of MD RAID1 and I am a bit lost
> regarding the superblock 1.2 format. Can you please help with the following?
> 
> I've created RAID1 like this:
> 
> DEVICE_COUNT = 1
> DEVICE_NUMBER = 0
> ROLES: 0x0000
> 
> mdadm reports it to be correct. I then used mdadm to grow it like this:
> 
> mdadm --grow /dev/md23 --raid-disks=2 --force
> mdadm /dev/md23 --add /dev/sdb1
> 
> Now I've inspected the superblocks of both devices and I have the following:
> 
> DEVICE_COUNT = 2
> DEVICE_NUMBER = 0
> ROLES: 0x0000 0xFFFF 0x0100
> 
> DEVICE_COUNT = 1
> DEVICE_NUMBER = 2
> ROLES: 0x0000 0xFFFF 0x0100
> 
Hello Tomas,
What command did you use to get this? I cannot match it with any
mdadm output.
> 
> The first device number is 0, so why is the second device 2 (with 1
> being skipped)? Should the count start at 1?

My blind guess is the "replacement" feature. As far as I know, MD/mdadm
allocates two slots for the same device (n, n+1) to support replacement,
i.e. automatically swapping out the old drive once recovery of the new
one has finished. So the old drive stays in use while the replacement
drive is being prepared.

So the behavior may have a logical explanation :)

> Why are there 3 roles now, when DEVICE_COUNT is 2? If the count starts
> at 1, why would there be a roles[0]?

First, please note that growing a RAID1 to 2 drives means:
- adding one drive;
- extending the metadata, where the new disk is "out-of-sync";
- performing "recovery" (not reshape!) for the new disk.

0xFFFF marks a slot with no active device in MD/mdadm (in the kernel's
md_p.h it is MD_DISK_ROLE_SPARE, while 0xFFFE is MD_DISK_ROLE_FAULTY).
As I said, I think this one is the slot for the replacement. IMO it
means that the replacement drive is missing, which is fine.

roles[0] - DEVICE_NUMBER=0
roles[1] - DEVICE_NUMBER=1 (replacement slot for DEVICE_NUMBER=0)
roles[2] - DEVICE_NUMBER=2 (new drive)
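
To make the mapping above easier to check, here is a minimal Python
sketch that decodes a roles array indexed by device number. It rests on
assumptions: the role constants are quoted from the kernel's md_p.h as
I remember them, the on-disk fields are little-endian 16-bit values, and
the dump "0x0000 0xFFFF 0x0100" shows the raw bytes in on-disk order.
The names describe_role and decode_roles are made up for illustration
and do not come from mdadm.

```python
import struct

# Role values as I recall them from the kernel's md_p.h (assumption).
MD_DISK_ROLE_SPARE = 0xFFFF
MD_DISK_ROLE_FAULTY = 0xFFFE
MD_DISK_ROLE_JOURNAL = 0xFFFD
MD_DISK_ROLE_MAX = 0xFF00  # largest value treated as a regular raid slot


def describe_role(role):
    """Turn one 16-bit role value into a human-readable string."""
    if role == MD_DISK_ROLE_SPARE:
        return "spare / not active"
    if role == MD_DISK_ROLE_FAULTY:
        return "faulty"
    if role == MD_DISK_ROLE_JOURNAL:
        return "journal"
    if role <= MD_DISK_ROLE_MAX:
        return "active, raid slot %d" % role
    return "reserved"


def decode_roles(raw):
    """Decode raw roles bytes (little-endian u16 each), indexed by device number."""
    count = len(raw) // 2
    roles = struct.unpack("<%dH" % count, raw[:count * 2])
    return [(dev_number, describe_role(r)) for dev_number, r in enumerate(roles)]


if __name__ == "__main__":
    # Hypothetical input: the three role entries from the report, taken as raw bytes.
    for dev_number, text in decode_roles(bytes.fromhex("0000ffff0100")):
        print("roles[%d]: %s" % (dev_number, text))
```

If the byte-order assumption holds, the third entry decodes to raid
slot 1, i.e. the device with DEVICE_NUMBER=2 is the second mirror. If
the dump was already converted to host order, the raw bytes differ and
this reading does not apply.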

> I am a bit confused. Obviously I am making some trivial mistake and I
> don't want to keep guessing anymore.
> Can you please tell me how to correctly handle this?

I'm still not sure where these properties come from. I searched all of
the code for "DEVICE_" with no success.

Anyway, I hope it is helpful. Good luck.

Thanks,
Mariusz



