On 04/26/2010 03:38 PM, Phillip Susi wrote:
> On 4/26/2010 3:07 PM, Doug Ledford wrote:
>> Then you need to remove the superblock from the device.
>
> Why? It has been removed. In English removed means it is no longer
> part of the array.

And in English raid means a hostile or predatory incursion; it has
nothing to do with disc drives. And in English cat is an animal you
pet. So technical jargon and regular English don't always agree. What's
your point?

> Elements which are not part of the array should not
> be MADE part of the array just because they happen to be there.

Sorry, but that's just not going to happen, ever. There are any number
of valid reasons why someone might want to temporarily remove a drive
from an array and then re-add it later, and when they re-add it they
want it to come back, and they want it to know that it used to be part
of the array and only resync the necessary bits (if you have a write
intent bitmap, otherwise it resyncs the whole thing).

> Having to zero the superblock after failing and removing the drive is a
> race condition with detecting the drive and automatically adding it back
> to the array.

No, it's not. The udev rules that add the drive don't race with manually
removing it because they don't act on change events, only add events.

> To properly remove the disk from the array the superblock
> needs to be updated before the kernel releases the underlying device.

Not going to happen. Doing what you request would undo a number of very
useful features in the raid stack. So you might as well save your
breath: we aren't going to make a remove event equivalent to a zero
superblock event because then the entire --re-add option would be
rendered useless.

>> The problem here seems to be an issue of expectations. You think that
>> "removed" is used as a flag to record intent, where as it actually is
>> nothing more than a matter of state.
>
> No, I don't think it has anything to do with intent.
> I think that the state of being removed means it is no longer part of
> the array. It sounds like your understanding of the state should be
> described in English as detached or disconnected, rather than removed.

Depends on context. Removed makes perfect sense from the point of view
that the device has been removed from the list of devices currently held
with an exclusive open by the md raid stack.

>> Failed is also a matter of state. It means the device has encountered
>> some sort of error and we should no longer attempt to send any
>> read/write commands to the device. It is not a statement of *why* it's
>> in that state. The removed state indicates that the device has been
>> removed from the array and is no longer a slave to the array. Again, no
>> indication of intent or cause, purely an issue of state.
>
> Yes, it does not indicate why, nor do we care. What we care about is
> that the drive failed or was removed, so we should not be using it. Why
> bother recording that fact in the superblock if you're just going to
> ignore it the next time you start the array?

Because there are both transient and permanent failures. Experience
caused us to switch from treating all failures as permanent to treating
failures as transient and picking up where we left off if at all
possible because too many people were having a single transient failure
render their array degraded, only to have a real issue come up sometime
later that then meant the array was no longer degraded, but entirely
dead. The job of the raid stack is to survive as much failure as
possible before dying itself. We can't do that if we allow a single,
transient event to cause us to stop using something entirely.

Besides, what you seem to be forgetting is that those events that make
us genuinely not want to use a device also make it so that at the next
reboot the device generally isn't available or seen by the OS
(controller failure, massive failure of the platter, etc).
Simply failing and removing a device using mdadm mimics a transient
failure. If you fail, remove, then zero-superblock, then you mimic a
permanent failure. There you go, you have a choice. If we were to do
as you wish, then users would no longer have a choice; they would be
forced into mimicking a hard failure only. I prefer to give users a
choice on how they want to do things. So, just because you happen to
think that the only way it *should* be done is like a hard failure
doesn't mean we are going to change it to be that way. Things are the
way they are for a reason; best to just learn to use --zero-superblock
if that's what you want.

-- 
Doug Ledford <dledford@xxxxxxxxxx>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband
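
For reference, the two choices described above map onto mdadm
invocations roughly as follows. This is only a dry-run sketch in
Python: it builds and prints the command lines rather than running
them, and the array and member names (/dev/md0, /dev/sdb1) as well as
the helper function names are made up for illustration.

```python
import shlex

# Hypothetical device names, for illustration only.
ARRAY = "/dev/md0"
MEMBER = "/dev/sdb1"


def transient_removal(array, member):
    """Fail and remove a member; its md superblock is left in place."""
    return [
        ["mdadm", array, "--fail", member],
        ["mdadm", array, "--remove", member],
    ]


def readd(array, member):
    """Return a previously removed member to the array it belonged to."""
    return [["mdadm", array, "--re-add", member]]


def permanent_removal(array, member):
    """Fail and remove a member, then wipe its md superblock."""
    return transient_removal(array, member) + [
        ["mdadm", "--zero-superblock", member],
    ]


def render(commands):
    """Format the command lists as shell lines without executing them."""
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    print("# mimic a transient failure, then bring the member back")
    for line in render(transient_removal(ARRAY, MEMBER) + readd(ARRAY, MEMBER)):
        print(line)
    print("# mimic a permanent failure")
    for line in render(permanent_removal(ARRAY, MEMBER)):
        print(line)
```

The only difference between the two paths is the final
--zero-superblock step, which is what keeps the member from being
recognised as part of the array afterwards.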