On 21/08/2019 13:14, Song Liu wrote:
> [...]
>
> What do you mean by "not clear MD_BROKEN"? Do you mean we need to restart
> the array?
>
> IOW, the following won't work:
>
> mdadm --fail /dev/md0 /dev/sdx
> mdadm --remove /dev/md0 /dev/sdx
> mdadm --add /dev/md0 /dev/sdx
>
> And we need the following instead:
>
> mdadm --fail /dev/md0 /dev/sdx
> mdadm --remove /dev/md0 /dev/sdx
> mdadm --stop /dev/md0 /dev/sdx
> mdadm --add /dev/md0 /dev/sdx
> mdadm --run /dev/md0 /dev/sdx
>
> Thanks,
> Song
>

Song, I've tried the first procedure (without the --stop) and could not
make it work on linear/raid0 arrays, even on a vanilla kernel.
What I could do is:

1) Mount an array and, while writing, remove a member (nvme1n1 in my
case); "mdadm --detail md0" will show either 'clean' state, or 'broken'
if my patch is applied;

2) Unmount the array and run:
"mdadm -If nvme1n1 --path pci-0000:00:08.0-nvme-1"
This results in:
"mdadm: set device faulty failed for nvme1n1: Device or resource busy"
Despite the error, the md0 device is gone.

3) echo 1 > /sys/bus/pci/rescan
[nvme1 device is back]

4) mdadm -A --scan
[md0 is back, with both devices and 'clean' state]

So, whether we "--stop" the array or incrementally fail a member, when
the array is back its state will be 'clean' and not 'broken'.
Hence, I don't see a point in clearing the MD_BROKEN flag for
raid0/linear arrays, nor do I see where we could do it.

And thanks for the link to your tree, I'll certainly rebase my patch
against that.

Cheers,


Guilherme