On 13.02.2013 12:45, Sebastian Riemer wrote:
> On 13.02.2013 03:38, NeilBrown wrote:
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 8b557d2..292cc2f 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -6529,7 +6529,17 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>>  			mddev->ro = 0;
>>  			sysfs_notify_dirent_safe(mddev->sysfs_state);
>>  			set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
>> -			md_wakeup_thread(mddev->thread);
>> +			/* mddev_unlock will wake thread */
>> +			/* If a device failed while we were read-only, we
>> +			 * need to make sure the metadata is updated now.
>> +			 */
>> +			if (test_bit(MD_CHANGE_DEVS, &mddev->flags)) {
>> +				mddev_unlock(mddev);
>> +				wait_event(mddev->sb_wait,
>> +					   !test_bit(MD_CHANGE_DEVS, &mddev->flags) &&
>> +					   !test_bit(MD_CHANGE_PENDING, &mddev->flags));
>> +				mddev_lock(mddev);
>> +			}
>>  		} else {
>>  			err = -EROFS;
>>  			goto abort_unlock;
>>
>
> Thanks, Neil!
>
> I can confirm the issue on 3.4.y and that your patch fixes it reliably.
>
> Acked-by: Sebastian Riemer <sebastian.riemer@xxxxxxxxxxxxxxxx>
>

Damn, I've got a kernel which still crashes in
reap_sync_thread->raid1_spare_active() with a NULL pointer dereference
although this patch is applied. So the fix isn't correct yet.

I did some "objdump -S" on raid1.ko and found the issue at the following
code location in raid1_spare_active():

# for (i = 0; i < conf->raid_disks; i++) {
#         struct md_rdev *rdev = conf->mirrors[i].rdev;
#         struct md_rdev *repl = conf->mirrors[conf->raid_disks + i].rdev;

A resync was pending (the array was created without --assume-clean).

To me it looks like setting the device faulty races with the syncer. The
rdev isn't registered in the personality anymore, but the syncer tries to
access it for an immediate resync.

Cheers,
Sebastian
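
For readers following the loop quoted above, here is a minimal userspace sketch of the slot layout it walks: raid_disks original slots followed by raid_disks replacement slots, any of which may hold a NULL rdev once a device has been removed. This is not kernel code and not a proposed fix. The model_* names are hypothetical stand-ins, and the flag handling is deliberately simplified compared to raid1.c. It only illustrates why a walker has to treat the conf, the mirrors array, and each rdev pointer as possibly absent.

```c
/* Userspace model only: simplified, hypothetical stand-ins for r1conf/md_rdev. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_rdev {
	bool faulty;   /* stand-in for the Faulty flag bit */
	bool in_sync;  /* stand-in for the In_sync flag bit */
};

struct model_mirror {
	struct model_rdev *rdev;   /* NULL once the device has been removed */
};

struct model_conf {
	int raid_disks;
	struct model_mirror *mirrors;   /* 2 * raid_disks slots: originals, then replacements */
};

/*
 * Walk the slots in the same order as the quoted loop and mark every
 * present, non-faulty device in-sync. Returns how many devices changed
 * state. Empty (NULL) slots are skipped instead of dereferenced.
 */
int model_spare_active(struct model_conf *conf)
{
	int count = 0;

	/* In this model a missing conf or mirrors array is a caller bug. */
	assert(conf != NULL && conf->mirrors != NULL);

	for (int i = 0; i < conf->raid_disks; i++) {
		struct model_rdev *rdev = conf->mirrors[i].rdev;
		struct model_rdev *repl = conf->mirrors[conf->raid_disks + i].rdev;

		if (repl && !repl->faulty && !repl->in_sync) {
			repl->in_sync = true;
			count++;
		}
		if (rdev && !rdev->faulty && !rdev->in_sync) {
			rdev->in_sync = true;
			count++;
		}
	}
	return count;
}
```

In the model, a two-disk array with one healthy original, one removed original, and one faulty replacement reports exactly one state change, and a second pass reports none. Whether the crash seen here comes from a stale rdev, a stale conf, or something else is not something this sketch can answer.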