Start reshape failed

Hi Neil

When I add one disk to a 3-disk RAID5 array and run --grow to increase the number of
raid devices to 4, the reshape fails to start. I am using kernel 4.3.0-rc2 and the latest mdadm git tree.

[root@storageqe-09 md]# mdadm --grow /dev/md0 --raid-devices=4
mdadm: Failed to initiate reshape!
[root@storageqe-09 md]# uname -r
4.3.0-rc2
[root@storageqe-09 md]# mdadm --version
mdadm - v3.3.4-24-g86a406c - 28th September 2015

After some analysis, I found some hints.

A: When --grow runs, mdadm wants to write "reshape" to sync_action. Before that, it issues
SET_ARRAY_INFO in impose_reshape. Handling SET_ARRAY_INFO takes the mutex mddev->reconfig_mutex
and calls mddev_resume, which sets MD_RECOVERY_NEEDED.

B: At the same time, raid5d runs and calls md_check_recovery, but it cannot get the lock
mddev->reconfig_mutex, so it misses the chance to clear MD_RECOVERY_NEEDED.

After A, mdadm writes "reshape" to sync_action, which calls action_store. It returns EBUSY
because MD_RECOVERY_NEEDED is still set:

	} else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) ||
		   test_bit(MD_RECOVERY_NEEDED, &mddev->recovery))
		return -EBUSY;

It's a little complex. I added a call to md_check_recovery in action_store, and that fixes
the problem, but I don't think it is the right way to fix it.
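
To make the ordering concrete, here is a minimal user-space sketch in C of the sequence
described above. It is only a model, not kernel code: struct mddev_model, the *_model
helpers, simulate_grow and the bit numbers are all hypothetical names and values made up
for illustration, the mutex is reduced to a boolean, and the threads are replaced by a
fixed ordering of the steps. The check_first parameter stands in for the workaround of
calling md_check_recovery from action_store.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative bit numbers only; they are not taken from the kernel headers. */
enum { MD_RECOVERY_RUNNING = 0, MD_RECOVERY_NEEDED = 1 };

/* Hypothetical stand-in for the two pieces of struct mddev involved here. */
struct mddev_model {
	unsigned long recovery;  /* models mddev->recovery */
	bool reconfig_locked;    /* models mddev->reconfig_mutex being held */
};

static bool test_bit_model(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

static void set_bit_model(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

static void clear_bit_model(int nr, unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

/* Step A: SET_ARRAY_INFO takes the lock; mddev_resume sets MD_RECOVERY_NEEDED. */
static void set_array_info_begin_model(struct mddev_model *m)
{
	m->reconfig_locked = true;
	set_bit_model(MD_RECOVERY_NEEDED, &m->recovery);
}

static void set_array_info_end_model(struct mddev_model *m)
{
	m->reconfig_locked = false;
}

/* Step B: md_check_recovery gives up without clearing the flag if the lock is held. */
static bool check_recovery_model(struct mddev_model *m)
{
	if (m->reconfig_locked)
		return false;
	clear_bit_model(MD_RECOVERY_NEEDED, &m->recovery);
	return true;
}

/* Writing "reshape" to sync_action: refuse while either flag is set. */
static int action_store_reshape_model(struct mddev_model *m, bool check_first)
{
	if (check_first)
		check_recovery_model(m);  /* models the workaround described above */
	if (test_bit_model(MD_RECOVERY_RUNNING, &m->recovery) ||
	    test_bit_model(MD_RECOVERY_NEEDED, &m->recovery))
		return -EBUSY;
	set_bit_model(MD_RECOVERY_RUNNING, &m->recovery);
	return 0;
}

/* Replays A, then B while the lock is held, then the sync_action write. */
int simulate_grow(bool check_first)
{
	struct mddev_model m = { 0, false };

	set_array_info_begin_model(&m);
	check_recovery_model(&m);      /* raid5d runs here and cannot get the lock */
	set_array_info_end_model(&m);
	return action_store_reshape_model(&m, check_first);
}
```

In this model simulate_grow(false) returns -EBUSY, which corresponds to the "Failed to
initiate reshape!" message, and simulate_grow(true) returns 0. The model says nothing
about whether calling md_check_recovery from action_store is safe in the real kernel.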

Could you give some suggestions?

Best Regards
Xiao