> On Oct 15, 2019, at 8:48 PM, Yufen Yu <yuyufen@xxxxxxxxxx> wrote: > >>> drivers/md/md.c | 51 +++++++++++++++++++++++++++++-------------------- >>> 1 file changed, 30 insertions(+), 21 deletions(-) >>> >>> diff --git a/drivers/md/md.c b/drivers/md/md.c >>> index 1be7abeb24fd..1be1deca3e3a 100644 >>> --- a/drivers/md/md.c >>> +++ b/drivers/md/md.c >>> @@ -1097,7 +1097,7 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor >>> { >>> char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE]; >>> mdp_super_t *sb; >>> - int ret; >>> + int ret = 0; >>> >>> /* >>> * Calculate the position of the superblock (512byte sectors), >>> @@ -1111,14 +1111,12 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor >>> if (ret) >>> return ret; >>> >>> - ret = -EINVAL; >>> - >> I think ret is handled correctly in existing code. I would not recommend this >> type of refactoring. > > Yes, there is no problem for existing code. But, after adding follow test: > > + if (sb->disks[rdev->desc_nr].state & ( > + (1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE))) > + if (ev1 > ev2) > + ret = 1; > > > If the current disk is spare or events is smaller, then ret will be set as default '-EINVAL', > which is not expected. Just as super_1_load(), I refactor the return. > > If you don't like it, I can modify the test as following without refactoring: > > + if (sb->disks[rdev->desc_nr].state & ( > + (1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE)) && > + (ev1 > ev2)) > ret = 1; > else > ret = 0; I like this better. > >> >>> bdevname(rdev->bdev, b); >>> sb = page_address(rdev->sb_page); >>> >>> if (sb->md_magic != MD_SB_MAGIC) { >>> pr_warn("md: invalid raid superblock magic on %s\n", b); >>> - goto abort; >>> + return -EINVAL; >>> } >>> >>> if (sb->major_version != 0 || >>> @@ -1126,15 +1124,15 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor >>> sb->minor_version > 91) { >>> pr_warn("Bad version number %d.%d on %s\n", >>> sb->major_version, sb->minor_version, b); >>> - goto abort; >>> + return -EINVAL; >>> } >>> >>> if (sb->raid_disks <= 0) >>> - goto abort; >>> + return -EINVAL; >>> >>> if (md_csum_fold(calc_sb_csum(sb)) != md_csum_fold(sb->sb_csum)) { >>> pr_warn("md: invalid superblock checksum on %s\n", b); >>> - goto abort; >>> + return -EINVAL; >>> } >>> >>> rdev->preferred_minor = sb->md_minor; >>> @@ -1156,19 +1154,24 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor >>> if (!md_uuid_equal(refsb, sb)) { >>> pr_warn("md: %s has different UUID to %s\n", >>> b, bdevname(refdev->bdev,b2)); >>> - goto abort; >>> + return -EINVAL; >>> } >>> if (!md_sb_equal(refsb, sb)) { >>> pr_warn("md: %s has same UUID but different superblock to %s\n", >>> b, bdevname(refdev->bdev, b2)); >>> - goto abort; >>> + return -EINVAL; >>> } >>> ev1 = md_event(sb); >>> ev2 = md_event(refsb); >>> - if (ev1 > ev2) >>> - ret = 1; >>> - else >>> - ret = 0; >>> + >>> + /* >>> + * Insist on good event counter while assembling, except >>> + * for spares (which don't need an event count) >>> + */ >>> + if (sb->disks[rdev->desc_nr].state & ( >>> + (1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE))) >>> + if (ev1 > ev2) >>> + ret = 1; >> Just realized this: >> >> If the first device being passed to load_super is a spare device, we still >> have the same problem, no? > > Good catch! > > My local test with one spare disk and one normal disk shows that, > no matter what order of two disks in 'mdadm -A', spare disk cannot > be the first device in mddev->disks list. I think mdadm tool have > ordered all disks by events. So, test case will be OK. > > But, I think we need to resolve the problem completely. Yes, let's resolve this in kernel. Thanks, Song