On Wed, 27 Jul 2011 15:06:10 +0200 Lutz Vieweg <lvml@xxxxxx> wrote:

> On 07/27/2011 02:44 PM, John Robinson wrote:
> >> Can you describe the criteria for MD considering a block as faulty?
> >
> > I'll try to answer this having followed some of the discussion around it.
>
> Thanks a lot for the explanation!

Yes John, thanks for posting.

> > Once the controller or power issues are resolved, the bad block list can be
> > administratively modified or cleared.
>
> Ah, that's good.

"administratively" probably isn't the right word. You cannot tell md to
remove blocks from the list (except for testing purposes).

When md finds that it might be good to write to a known-bad-block it has two
options - to write or not. It makes the choice based on whether it has seen
any write errors on that device since the array was assembled.
If it has, it just doesn't write and leaves the block 'bad'.
If it has not, it tries to write. On success it clears the record of the bad
block. On failure it decides not to write to any more bad blocks on that
device.

So if you have a device that is incorrectly reporting errors and filling up
the bad block list, and you then stop the array, fix the hardware, and
re-assemble, then the bad blocks will gradually disappear as writes try to
write to them again and succeed.

A 'check' pass should automatically fix everything up as it tries to re-write
bad blocks.

> > I don't think mdadm knows whether its constituent devices are SSDs.
>
> In block/cfq-iosched.c I see a test that looks like this:
>
>     if (blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag)
>         return;
>
> If that isn't conclusive, putting a note into the mdadm man-page is probably
> the best one can do.

The idea of marking a device as 'rotational' always seemed dumb to me,
because people assume that 'rotational' is a disk drive and '!rotational' is
an SSD. But what if some other technology comes along with behaviour
somewhere between the two?
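
For reference, the queue flag that the quoted cfq test looks at is also
exported to userspace, inverted, as /sys/block/<dev>/queue/rotational
("1" for rotational, "0" for non-rotational). The following is only a
minimal sketch of how a userspace tool could read it; the helper names
parse_rotational() and device_is_rotational() are made up for this
illustration and are not part of mdadm or the kernel. It assumes <dev> is a
whole-disk name such as "sda", not a partition.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: interpret the text read from queue/rotational.
 * Returns 1 for rotational, 0 for non-rotational, -1 if unparseable. */
int parse_rotational(const char *text)
{
    if (text == NULL)
        return -1;
    if ((text[0] == '0' || text[0] == '1') &&
        (text[1] == '\n' || text[1] == '\0'))
        return text[0] - '0';
    return -1;
}

/* Hypothetical helper: read /sys/block/<dev>/queue/rotational.
 * Returns 1, 0, or -1 if the attribute cannot be read or parsed. */
int device_is_rotational(const char *dev)
{
    char path[256];
    char buf[8] = "";
    FILE *f;
    int n;

    n = snprintf(path, sizeof(path), "/sys/block/%s/queue/rotational", dev);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;              /* device name too long for the buffer */

    f = fopen(path, "r");
    if (f == NULL)
        return -1;              /* no such device, or no sysfs */

    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_rotational(buf);
}
```

A caller would treat -1 as "unknown" rather than guessing; the flag only
says what the block layer was told, not what kind of device it really is.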
I think the primary meaning of '!rotational' as implemented is 'seek is
instant'. This is quite a different meaning to 'blocks migrate around the
device' even though both are true of current SSDs.

I'm not sure that md can usefully do anything different on SSDs than on
spinning rust. You certainly still want to record read errors. If you get a
write error it probably means that a large part of the device is bad ... but
I suspect you will notice that soon enough anyway.

NeilBrown

> Regards,
>
> Lutz Vieweg
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html