On 9/20/18 9:36 PM, Gi-Oh Kim wrote:
On Wed, Sep 19, 2018 at 5:09 PM Gi-Oh Kim <gi-oh.kim@xxxxxxxxxxxxxxxx> wrote:
Hi,
I found a weird behavior of re-adding a device.
I think it is a kernel bug.
I would appreciate it if somebody could confirm whether it is a bug or a feature.
I tested re-adding a device as follows.
1. create md with ram0 and ram1
2. add ram2
3. grow raid-device number to 3
4. remove ram2
5. grow raid-device number to 2
6. add ram2
7. ram0 becomes faulty and ram2 becomes active
8. stop md
9. assemble md with ram0 and ram1 => fail because ram0 is faulty
Hi,
I checked the kernel function raid1_spare_active() in raid1.c and
found out that ram0 is set as faulty on purpose.
If ram0 is set as faulty in order to replace it with ram2, I think
assembling ram1 and ram2 should succeed.
But "mdadm -A /dev/md111 /dev/ram1 /dev/ram2" creates md111 with only ram2.
I do not understand why it is necessary to set ram0 faulty.
When ram2 is added back to the array, it seems the array prefers to reuse
the previous role, which is recorded in saved_raid_disk.
How can I re-add ram2 device as the spare device without setting ram0 faulty?
I guess you can achieve that by removing the superblock of ram2
(before step 6).
BTW, could you try the below change? I think it can fix the issue.
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 4e990246225e..1d54109071cc 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1734,6 +1734,7 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 	 */
 	if (rdev->saved_raid_disk >= 0 &&
 	    rdev->saved_raid_disk >= first &&
+	    rdev->saved_raid_disk < conf->raid_disks &&
 	    conf->mirrors[rdev->saved_raid_disk].rdev == NULL)
 		first = last = rdev->saved_raid_disk;
Thanks,
Guoqing