Hi Xiao,

On Tue, 21 Dec 2021 09:40:50 +0800
Xiao Ni <xni@xxxxxxxxxx> wrote:

> Now for a raid0, it can't remove one member disk from raid0. It
> returns EBUSY and the raid0 still can work well. It makes sense.
> Because all member disks are busy, the admin can't remove the member
> disk and mdadm gives a proper error.

EBUSY means that the drive is busy, but it is not. The drive just cannot
be safely removed. As I wrote in patch 2:

If "faulty" was not set then -EBUSY was returned to userspace. As a
result, mdadm expects -EBUSY if the array becomes failed. There are some
reasons not to consider this mechanism correct:
- a drive can't be failed for different reasons.
- there are paths where -EBUSY is not reported and drive removal leads
  to a failed array, without notification for userspace.
- in the array failure case -EBUSY seems to be the wrong status. The
  array is not busy, but the removal process cannot proceed safely.

For compatibility reasons I cannot remove EBUSY. I left a more detailed
explanation in patch 2.

> With this patch, it changes the situation. In raid0_error, it sets
> MD_BROKEN. In fact, it isn't broken. So is it really good to set
> MD_BROKEN here? In patch 62f7b1989c0 ("md raid0/linear: Mark array as
> 'broken'...), MD_BROKEN is introduced
> when the member disk disappears and the disk is really broken. For
> raid0/linear, the raid device can't work anymore.

It is broken. Any md_error() call should end with an appropriate action:
- drive removal (if possible)
- failing the array (if the array cannot be degraded)

We cannot trust a drive once md_error() has been called for it, so writes
will be blocked. IMO it is reasonable not to engage the level stack,
because one or more members cannot be trusted (even if they are still
accessible). Just allow reading old data (if still possible).

Thanks,
Mariusz
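
A rough userspace sketch of the policy described above, where any md_error() call ends either in member removal or in a failed array with writes blocked. This is not kernel code and not taken from the patches: the struct, enum and function names are hypothetical and exist only for illustration. Only MD_BROKEN and md_error() come from the discussion itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an array; NOT the kernel's struct mddev. */
struct model_array {
	int working_disks;	/* members that are still trusted */
	int min_disks;		/* fewest members the level can run with */
	bool broken;		/* stands in for the MD_BROKEN flag */
};

enum model_outcome {
	OUTCOME_MEMBER_FAILED,	/* array can degrade: drop the member */
	OUTCOME_ARRAY_BROKEN,	/* array cannot degrade: fail the array */
};

/* Models the rule "every md_error() call ends with an action". */
static enum model_outcome model_md_error(struct model_array *a)
{
	assert(a != NULL);

	if (a->working_disks - 1 >= a->min_disks) {
		/* Redundancy left: the member can be failed and removed. */
		a->working_disks--;
		return OUTCOME_MEMBER_FAILED;
	}

	/*
	 * No redundancy left (e.g. raid0/linear): the member stays in
	 * place, but it is no longer trusted, so the array is marked
	 * broken instead of returning "busy".
	 */
	a->broken = true;
	return OUTCOME_ARRAY_BROKEN;
}

/* Writes are blocked on a broken array; reads of old data may go on. */
static bool model_write_allowed(const struct model_array *a)
{
	return !a->broken;
}
```

In this model a two-disk raid0 would be set up with min_disks = 2, so the first error marks it broken and blocks writes, while a two-disk mirror (min_disks = 1) just loses one member and stays writable.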