This sounds great!  But...

2/ Do you intend to create a user-space program to attempt to correct
   the bad block and put the device back in the array automatically?
   I hope so. If not, please consider correcting the bad block without
   kicking the device out.
   Reason: Once the device is kicked out, a second bad block on another
   device is fatal to the array. And this has been happening a lot
   lately.

3/ Maybe don't do the bad block scan if the array is degraded.
   Reason: If a bad block is found, that would kick out a second disk,
   which is fatal. Since the stated purpose of this is to "check
   parity/copies are correct", you probably can't do this anyway. I
   just want to be sure.

   Also, if a device is kicked during the scan, the scan should pause
   or abort. The scan can resume once the array has been corrected. I
   would be happy if the scan had to be restarted from the start, so a
   pause or abort is fine with me.

Thanks for your time,
Guy

-----Original Message-----
From: linux-raid-owner@xxxxxxxxxxxxxxx
[mailto:linux-raid-owner@xxxxxxxxxxxxxxx] On Behalf Of Neil Brown
Sent: Monday, November 15, 2004 5:27 PM
To: Guy Watkins
Cc: linux-raid@xxxxxxxxxxxxxxx
Subject: Re: Bad blocks are killing us!

On Monday November 15, guy@xxxxxxxxxxxxxxxx wrote:
> Neil,
> This is a private email. You can post it if you want.
snip
>
> Anyway, in the past there have been threads about correcting bad
> blocks automatically within md. I think a RAID1 patch was created
> that will attempt to correct a bad block automatically. Is it likely
> that you will pursue this for RAID5 and maybe RAID6? I hope so.

My current plans for md are:

1/ incorporate the "bitmap resync" patches that have been floating
   around for some months. This involves a reasonable amount of work
   as I want them to work with raid5/6/10 as well as raid1. raid10 is
   particularly interesting as resync is quite different from recovery
   there.

2/ Look at recovering from failed reads that can be fixed by a write.
   I am considering leveraging the "bitmap resync" stuff for this.
   With the bitmap stuff in place, you can let the kernel kick out a
   drive that has a read error, let user-space have a quick look at
   the drive and see if it might be a recoverable error, and then give
   the drive back to the kernel. It will then do a partial resync
   based on the bitmap information, thus writing the bad blocks, and
   all should be fine. This would mean re-writing several megabytes
   instead of a few sectors, but I don't think that is a big cost.
   There are a few issues that make it a bit less trivial than that,
   but it will probably be my starting point.
   The new "faulty" personality will allow this to be tested easily.

3/ Look at background data scans - i.e. read the whole array and check
   that parity/copies are correct. This will be triggered and
   monitored by user-space. If a read error happens during the scan,
   we trip the recovery code discussed above.

While these are my current intentions, there are no guarantees and
definitely no time frame. I get to spend about 50%-60% of my time on
this at the moment, so there is hope.

> About RAID6, you have fixed a bug or 2 in the last few weeks. Would
> you consider RAID6 stable (safe) yet?

I'm not really in a position to answer that.

The code is structurally very similar to raid5, so there is a good
chance that there are no races or awkward edge cases (unless there
still are some in raid5).
The "parity" arithmetic has been extensively tested out of the kernel
and seems to be reliable.
Basic testing seems to show that it largely works, but I haven't done
more than very basic testing myself.

So it is probably fairly close to stable. What it really needs is
lots of testing.

Build a filesystem on a raid6 and then in a loop:
    mount / do metadata-intensive stress test / umount / fsck -f
while that is happening, fail, remove, and re-add various drives.
Try to cover all combinations of failing active drives and
spares-being-rebuilt while 0, 1, or 2 drives are missing.
Try using a "faulty" device and causing it to fail as well as just
"mdadm --set-faulty".

If you cannot get it to fail, you will have increased your confidence
in its safety.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
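
For anyone wanting to try the raid6 test described above, here is a rough sketch of how the "0, 1, or 2 drives missing" combinations could be enumerated. It is only an illustration, not anything from the md or mdadm sources. The array name, the member device names, and the function names are all made up for the example. It only builds command strings and never runs them, and it covers only the failing-active-drives part of the test, not spares-being-rebuilt or the "faulty" personality.

```python
from itertools import combinations


def degraded_sets(members, max_missing=2):
    """Yield every tuple of members to take out, from 0 up to max_missing."""
    for count in range(max_missing + 1):
        for combo in combinations(members, count):
            yield combo


def mdadm_plan(array, members, max_missing=2):
    """Build one scenario per degraded set.

    Each scenario is a list of mdadm command lines (plain strings):
    fail and remove every drive in the set, then add them all back.
    Nothing is executed here; the strings are meant for review.
    """
    plan = []
    for missing in degraded_sets(members, max_missing):
        commands = []
        for dev in missing:
            commands.append(f"mdadm {array} --set-faulty {dev}")
            commands.append(f"mdadm {array} --remove {dev}")
        for dev in missing:
            commands.append(f"mdadm {array} --add {dev}")
        plan.append(commands)
    return plan


if __name__ == "__main__":
    # Hypothetical 4-drive raid6; substitute real device names.
    members = ["/dev/sda1", "/dev/sdb1", "/dev/sdc1", "/dev/sdd1"]
    for number, commands in enumerate(mdadm_plan("/dev/md0", members)):
        print(f"# scenario {number}")
        for line in commands:
            print(line)
```

Each scenario would be run while the mount / stress test / umount / fsck -f loop is going, waiting for the rebuild to finish before moving on to the next one. The waiting and the stress test itself are left out of the sketch.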