Re: RAID creation resync behaviors

On 05/03/2017 10:04 PM, Shaohua Li wrote:
On Thu, May 04, 2017 at 11:07:01AM +1000, Neil Brown wrote:
On Wed, May 03 2017, Shaohua Li wrote:

Hi,

Currently we have different resync behaviors in array creation.

- raid1: copy data from disk 0 to disk 1 (overwrite)
- raid10: read both disks, compare and write if there is a difference (compare-write)
- raid4/5: read first n-1 disks, calculate parity and then write parity to the last disk (overwrite)
- raid6: read all disks, calculate parity and compare, and write if there is a difference (compare-write)
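
To make the two terms concrete, here is a minimal sketch of overwrite versus
compare-write for a two-disk mirror. It is an illustration only, not the md
driver's code: the in-memory bytearrays, the function names and the 4-byte
chunk size are all assumptions chosen for the example. The point is the
number of writes issued to the second disk.

```python
def overwrite_sync(src: bytearray, dst: bytearray, chunk: int = 4) -> int:
    """Copy every chunk from src to dst. Returns the number of chunk writes."""
    assert len(src) == len(dst), "mirror members must be the same size"
    writes = 0
    for off in range(0, len(src), chunk):
        dst[off:off + chunk] = src[off:off + chunk]  # unconditional write
        writes += 1
    return writes


def compare_write_sync(src: bytearray, dst: bytearray, chunk: int = 4) -> int:
    """Read both sides and write only chunks that differ. Returns chunk writes."""
    assert len(src) == len(dst), "mirror members must be the same size"
    writes = 0
    for off in range(0, len(src), chunk):
        if dst[off:off + chunk] != src[off:off + chunk]:  # extra read + compare
            dst[off:off + chunk] = src[off:off + chunk]
            writes += 1
    return writes


if __name__ == "__main__":
    disk0 = bytearray(b"A" * 16)
    stale = bytearray(b"A" * 16)
    stale[5] = 0  # one byte differs, so exactly one 4-byte chunk is out of sync

    print(overwrite_sync(disk0, bytearray(stale)))      # 4
    print(compare_write_sync(disk0, bytearray(stale)))  # 1
```

Overwrite never reads the destination, so it is cheap in reads but writes
everything; compare-write pays for reading both sides and only writes where
the contents differ.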

The approach taken for raid1 and raid4/5 provides the fastest sync for
an array built on uninitialised spinning devices.
RAID6 could use the same approach but would involve more CPU and so
the original author of the RAID6 code (hpa) chose to go for the low-CPU
cost option.  I don't know if tests were done, or if they would still be
valid on new hardware.
The raid10 approach comes from "it is too hard to optimize in general
because different RAID10 layouts have different trade-offs, so just
take the easy way out."
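
For the raid4/5 case, the parity in question is the byte-wise XOR of the data
chunks in a stripe. A minimal sketch follows, assuming equal-sized chunks held
in memory. The helper names are hypothetical, parity rotation across disks is
ignored, and the second RAID6 syndrome (which is what costs the extra CPU) is
not modelled.

```python
from functools import reduce
from operator import xor


def xor_parity(chunks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-sized chunks (the raid4/5 style parity)."""
    if len({len(c) for c in chunks}) != 1:
        raise ValueError("need at least one chunk, all of the same size")
    return bytes(reduce(xor, column) for column in zip(*chunks))


def parity_for_overwrite(data_chunks: list[bytes]) -> bytes:
    """Overwrite: compute parity from the data and write it unconditionally."""
    return xor_parity(data_chunks)


def parity_needs_write(data_chunks: list[bytes], stored_parity: bytes) -> bool:
    """Compare-write: also read the stored parity, write only on mismatch."""
    return xor_parity(data_chunks) != stored_parity


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
    parity = parity_for_overwrite(data)
    print(parity.hex())                      # 152a
    print(parity_needs_write(data, parity))  # False
    # XOR of the parity with the surviving chunks rebuilds a missing chunk.
    print(xor_parity([parity, data[1], data[2]]) == data[0])  # True
```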

ok, thanks for the explanation!

Writing the whole disk is very unfriendly to SSDs, because it reduces their
lifetime. And if the user has already done a trim before creation, the
unnecessary writes could make the SSD slower in the future. Could we prefer
compare-write to overwrite if mdadm detects that the disks are SSDs? Compare-write
is certainly slower than overwrite in some cases, so maybe add a new option to
mdadm. An option to let mdadm trim the SSDs before creation sounds reasonable too.

An option to ask mdadm to trim the data space and then --assume-clean
certainly sounds reasonable.

This doesn't work well. Reads of trimmed data space return 0 on some SSDs, but
not all. On those that do not, we will have trouble.

/sys/block/<device>/queue/discard_zeroes_data

We could use this as an indicator for what to do.
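
A minimal sketch of such a check, reading the attribute for each member
device. The function names, the returned strategy strings and the
sysfs_root parameter (there so the code can be tried against a scratch
directory) are assumptions for illustration; this is not what mdadm does.
Also note that later kernel documentation describes this attribute as always
returning 0, so the check may only be meaningful on kernels contemporary with
this thread.

```python
from pathlib import Path
from typing import Optional


def discard_zeroes_data(device: str, sysfs_root: str = "/sys/block") -> Optional[bool]:
    """Return True/False from the sysfs attribute, or None if it cannot be read."""
    attr = Path(sysfs_root) / device / "queue" / "discard_zeroes_data"
    try:
        return attr.read_text().strip() == "1"
    except OSError:
        return None  # attribute missing or unreadable: treat as unknown


def creation_strategy(devices: list[str], sysfs_root: str = "/sys/block") -> str:
    """Hypothetical policy: skip the resync only if every member reads zeroes after trim."""
    if devices and all(discard_zeroes_data(d, sysfs_root) for d in devices):
        return "trim-then-assume-clean"
    return "resync"


if __name__ == "__main__":
    # Device names are examples only; substitute the real array members.
    print(creation_strategy(["sda", "sdb"]))
```

Treating an unreadable attribute as unknown makes the policy fall back to a
normal resync, which is the safe choice when the device's behaviour after
trim cannot be confirmed.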

Jes