> I have a problem about raid5. I created a raid5 [ 3+1 2TB
> with 128KiB chunks ... ] parallel write 150 files to the
> array, the speed of each is 1MB/s.

Your problem with RAID5 is that even writing at just 1MB/s per
file, writing 150 streams in parallel can mean a lot of arm
movement, and a RAID5 of just 4 consumer-class drives probably
can't deliver that many IOPS, unless your writes are pretty
large (e.g. every stream writes 1MB in one go every second). It
probably will mostly work, but with pretty tight margins: both
because 4 disks are not that many for 150 even relatively slow
streams, and because RAID5 writes are correlated, as full-stripe
writes (in your case 384KiB at least) have to be done to avoid
read-modify-write (some rough numbers below, just before the
recommendations).

> Unfortunately, the electricity went off suddenly at the
> time. when I turn on the device again, I found the raid5 is in
> recovery. When the progress of the recovery went up to 98%,
> there was a write error occurred. [ ... ]

That's a bad situation for those disks, because a *write* error
means that there are no spare sectors left in the whole disk: on
a write, the disk firmware, on finding a bad sector, can always
transparently substitute a spare for it, as long as spare
sectors are available. That no spare sectors are left means that
the firmware had previously found a lot of bad sectors.

From this point onwards you no longer have a RAID issue; the MD
RAID has attempted its rebuild after finding the drives out of
sync, and it is now purely a hardware issue. It is a bit
offtopic, but let's go over it without too many details, making
the obligatory references to MD RAID aspects where appropriate.

The first one is a vastly misunderstood point about base RAID
systems like MD: they are not supposed to detect errors; they
are deliberately designed under the assumption that any and
every storage issue is discovered and reported to MD by the
block device layer and below. So, for example, the purpose of
parity is the *reconstruction* of data once the block device
layer has reported a device issue, not the *detection* of
corrupted data. Parity can also be used for detection, as an
aside, but a number of optimizations in parity RAID depend on
not using parity to detect issues.

> used “HDD_Regenerator” to check if there were bad blocks in
> the disks. The result of the output indicated that sda and sdb
> did have a bad sector.

They have many, but most of them have been remapped to spares;
the count is in the SMART attribute 'Reallocated_Sector_Ct' (how
to read it is sketched below). What the output tells you is that
they have at least one *unspared* bad sector.

> These disks were used for the first time after purchased. Is
> it normal to have bad sectors?

It is quite normal to have bad sectors: a 2TB drive has 4
billion 512B sectors, or 0.5 billion 4KiB sectors, and *some*
percentage of that very large number must be defective.

MD note: since a small number (hundreds) of sectors usually
flips to bad over time, it is convenient to use MD sync-checking
to *detect* issues. Since this is an aside convenience, it must
be requested explicitly (and if used it is usually VERY
IMPORTANT to ensure that SMART ERC is set to a short timeout);
it may also be useful to run periodic SMART selftests (commands
sketched below). But both MD sync-checking and SMART selftests
consume IOPS and bandwidth.

In your case, however, the sudden power loss, perhaps
accompanied by a power surge, may have damaged, in some way or
another depending on the disk mechanics, electronics and
firmware, some significant chunk of the recording surface.

> Could you please help me?
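To put very rough numbers on the write-load point above,
assuming the best case of perfectly aligned full-stripe writes
(real workloads will be worse):

    150 files x 1MB/s                      ~ 150MB/s aggregate
    full stripe = 3 data chunks x 128KiB   = 384KiB (+ 128KiB parity)
    150MB/s / 384KiB per stripe            ~ 400 stripes/s
    per member: ~400 writes/s of 128KiB    ~ 50MB/s per disk,
                                             plus seeking among
                                             ~150 file regions

A consumer 2TB drive might stream 100-150MB/s when writing
purely sequentially, so ~50MB/s per member leaves only limited
headroom for all that seeking, hence the tight margins.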
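As to reading the spared/unspared counters mentioned above,
assuming 'smartmontools' is installed, something like this shows
them (attribute names vary a bit between vendors, but 5/197/198
are the usual ones):

    # attribute table: look at Reallocated_Sector_Ct (5),
    # Current_Pending_Sector (197), Offline_Uncorrectable (198)
    smartctl -A /dev/sda
    smartctl -A /dev/sdb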
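And a sketch of the sync-check/ERC/selftest asides; 'md0' below
is just an example array name, and 7 seconds is a common, not
mandatory, ERC value:

    # set SMART ERC (error recovery control) to 7s for reads and
    # writes (units are tenths of a second); repeat per member disk
    smartctl -l scterc,70,70 /dev/sda

    # ask MD to read-check the whole array, then look at the result
    echo check > /sys/block/md0/md/sync_action
    cat /sys/block/md0/md/mismatch_cnt

    # long SMART selftest; results show up under 'smartctl -l selftest'
    smartctl -t long /dev/sda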
If you want to use 'sda' and 'sdb' for production systems with
any degree of criticality I would say don't do it. If you are
using them purely for testing I would suggest some steps that
*might* make them more useful again:

* If available, run SECURITY ERASE on the drives using a recent
  version of 'hdparm'. Many drive firmwares seem to combine
  SECURITY ERASE with refreshing and rebuilding the spared and
  spare sector lists (a sketch of the invocation is at the end
  of this message).

* Map the areas where there are unspared sectors using
  'badblocks' or 'dd_rescue', and then partition the disks
  (using GPT labelling) and create not-to-use partitions over
  those areas (again, sketched at the end of this message). You
  may have even 10% or more of the disk in bad sectors, but as
  long as the partition(s) you actually use don't cross a bad
  area, it is relatively safe to use them. Some older
  filesystems can be given bad-sector lists and will not use
  those sectors, but with RAID5 that becomes a bit complicated.

Note that drives with many *spared* sectors can often perform
badly, because the spare sectors that substitute for bad ones
can be rather far away from them, causing sudden long seeks.
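A sketch of the SECURITY ERASE step; 'p' is a throwaway
placeholder password, the drive must not be "frozen" or
"locked", and this wipes the whole drive, so triple-check the
device name:

    # confirm the security state first ("not frozen", "not locked")
    hdparm -I /dev/sda

    # set a temporary password, then issue the erase (can take hours)
    hdparm --user-master u --security-set-pass p /dev/sda
    hdparm --user-master u --security-erase p /dev/sda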
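And a sketch of the map-and-partition-around step; the
'badblocks' run below is the destructive write test, and the
partition boundaries are made-up examples to show the idea, not
measured ones:

    # destructive write test over the whole drive,
    # bad blocks are logged to a file
    badblocks -wsv -b 4096 -o sda.bad /dev/sda

    # then lay out GPT partitions that avoid the bad regions
    parted -s /dev/sda mklabel gpt
    parted -s /dev/sda mkpart use1 1MiB 800GiB
    parted -s /dev/sda mkpart skip 800GiB 900GiB    # not-to-use area
    parted -s /dev/sda mkpart use2 900GiB 100%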