Daniel Santos wrote:
> I retried rebuilding the array once again from scratch, and this time
> checked the syslog messages. The reconstructions process is getting
> stuck at a disk block that it can't read. I double checked the block
> number by repeating the array creation, and did a bad block scan. No bad
> blocks were found. How could the md driver be stuck if the block is fine ?
>
> Supposing that the disk has bad blocks, can I have a raid device on
> disks that have badblocks ? Each one of the disks is 400 GB.
>
> Probably not a good idea because if a drive has bad blocks it probably
> will have more in the future. But anyway, can I ?
> The bad blocks would have to be known to the md driver.

Well, almost all modern drives can remap bad blocks (at least I know of
no drive that can't). Most of the time it happens on write, because if
such a bad block is found during a read operation and the drive really
can't read the content of that block, it can't remap it either without
losing data.

From my experience (about 20 years, many hundreds of drives, mostly
(old) SCSI but (old) IDE too), it's pretty normal for a drive to develop
several bad blocks, especially during the first year of usage.
Sometimes, however, the number of bad blocks grows quite rapidly, and
such a drive definitely should be replaced - at least Seagate drives are
covered by warranty in this case.

SCSI drives have two so-called "defect lists", stored somewhere inside
the drive: the factory-preset list (bad blocks found during internal
testing when the drive was produced) and the grown list (bad blocks
found by the drive during normal usage). The factory-preset list can
contain from 0 to about 1000 entries or even more (depending on the
drive's size too); the grown list can be as large as 500 blocks or more.
Whether that is fatal or not depends on whether new bad blocks continue
to be found.
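
To make the "does the list keep growing?" criterion concrete, here is a
minimal Python sketch. The function name, the window size and the sample
counts are all made up for illustration; how the grown-defect counts are
collected from a drive is deliberately left out.

```python
def grown_list_still_growing(samples, window=3):
    """Return True if the grown-defect count rose within the last
    `window` samples.

    `samples` is a chronological list of grown-defect counts recorded
    by the admin (hypothetical data, e.g. one reading per month).
    """
    recent = samples[-window:]
    # Need at least two readings to say anything about growth.
    return len(recent) >= 2 and recent[-1] > recent[0]


# A drive whose list jumped early and then settled: not growing.
settled = [0, 120, 480, 500, 500, 500]
# A drive whose list is still climbing: candidate for replacement.
climbing = [10, 40, 90, 200]

print(grown_list_still_growing(settled))
print(grown_list_still_growing(climbing))
```

The absolute count is ignored on purpose: only the recent trend decides,
which matches the rule of thumb above.
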
We have several drives which developed that many bad blocks in the
first few months of usage; the list then stopped growing, and they have
been working just fine for more than 5 years.

Both defect lists can be shown by the scsitools programs. I don't know
how one can see defect lists on an IDE or SATA drive.

Note that the md layer (raid1, 4, 5, 6, 10 - but obviously not raid0 and
linear) is now able to repair bad blocks automatically, by forcing a
write to the same place on the drive where a read error occurred - this
usually forces the drive to automatically reallocate that block and
continue.

But in any case, md should not stall, be it during reconstruction or
not. On this I can't comment - to me it smells like a bug somewhere (md
layer? error handling in the driver? something else?) which should be
found and fixed. And for that, some more details are needed I guess; the
kernel version is a start.

/mjt
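
As a rough illustration of the read-error repair described above, here
is a toy Python model of a two-way mirror. It is only a sketch under
stated assumptions, not how the md driver is implemented: `ToyDisk` and
`mirror_read` are hypothetical names, the "drive" is an in-memory list,
and reallocation-on-write is simply assumed to succeed while spare
blocks remain.

```python
class ToyDisk:
    """In-memory stand-in for a drive (hypothetical, not a block device)."""

    def __init__(self, blocks, spare=4):
        self.data = list(blocks)
        self.unreadable = set()  # block numbers that fail on read
        self.remapped = set()    # block numbers the "drive" reallocated
        self.spare = spare       # remaining spare blocks

    def read(self, n):
        if n in self.unreadable:
            raise OSError(f"read error at block {n}")
        return self.data[n]

    def write(self, n, value):
        # Assumed drive behaviour: writing to a bad block reallocates it.
        if n in self.unreadable:
            if self.spare == 0:
                raise OSError(f"no spare blocks left for block {n}")
            self.spare -= 1
            self.unreadable.discard(n)
            self.remapped.add(n)
        self.data[n] = value


def mirror_read(disks, n):
    """Read block n from a toy mirror and rewrite it on members that failed."""
    failed = []
    good = []
    for disk in disks:
        try:
            good.append(disk.read(n))
        except OSError:
            failed.append(disk)
    if not good:
        raise OSError(f"block {n} unreadable on all members")
    value = good[0]
    for disk in failed:
        # Forcing a write to the failed location triggers the remap.
        disk.write(n, value)
    return value


a = ToyDisk(["x", "y"])
b = ToyDisk(["x", "y"])
a.unreadable.add(1)          # simulate a read error on one member
print(mirror_read([a, b], 1))
print(sorted(a.remapped))
```

The point of the model is the ordering: the good copy is fetched from
the redundancy first, and only then is the failing location
overwritten, so no data is lost when the drive reallocates the block.
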