Re: OT: silent data corruption reading from hard drives

Yes, that's what I know too.
The point is to develop an algorithm that does this at the block level, as an md device:
it writes to disk and checks that the write went ok;
it reads from disk and verifies the checksum. If it's wrong it reads again, and after X
failed reads it reports the block as a badblock (so a silent corruption
becomes a well-known badblock disk error).
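
A minimal sketch of that write-verify / read-retry logic, in Python rather than kernel code. Everything here is hypothetical: `IntegrityDevice`, `BadBlockError` and `max_retries` (the "X" above) are made-up names, the "disk" is just a dict in memory, and CRC32 from `zlib` stands in for whatever check a real implementation would use.

```python
import zlib


class BadBlockError(IOError):
    """Raised when a block fails verification and is (or gets) marked bad."""


class IntegrityDevice:
    """Toy in-memory model of the proposed integrity layer (hypothetical)."""

    def __init__(self, max_retries=3):
        self.blocks = {}         # block number -> data as stored on the "disk"
        self.checksums = {}      # block number -> CRC32 recorded at write time
        self.badblocks = set()   # blocks that will never be used again
        self.max_retries = max_retries

    def _raw_read(self, n):
        # Stand-in for the real disk read; a real disk may return bad data here.
        return self.blocks[n]

    def write(self, n, data):
        if n in self.badblocks:
            raise BadBlockError(f"block {n} is marked bad")
        data = bytes(data)
        self.blocks[n] = data
        self.checksums[n] = zlib.crc32(data)
        # Read the block back and check that the write really landed.
        if zlib.crc32(self._raw_read(n)) != self.checksums[n]:
            self.badblocks.add(n)
            raise BadBlockError(f"block {n}: write verification failed")

    def read(self, n):
        if n in self.badblocks:
            raise BadBlockError(f"block {n} is marked bad")
        for _attempt in range(self.max_retries):
            data = self._raw_read(n)
            if zlib.crc32(data) == self.checksums[n]:
                return data
        # Still wrong after max_retries reads: turn the silent
        # corruption into an explicit bad block.
        self.badblocks.add(n)
        raise BadBlockError(f"block {n}: checksum mismatch after "
                            f"{self.max_retries} reads")
```

With this model, overwriting `dev.blocks[n]` behind the device's back simulates a silent corruption: the next `dev.read(n)` raises `BadBlockError` and `n` ends up in `dev.badblocks`.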

I think that's all... it just needs someone to implement it. Some ideas:

# create (proposed syntax only: mdadm has no "integrity" level):
mdadm --create /dev/md0 --level=integrity /dev/sda1

# check:
echo "check" > /sys/block/md0/md/sync_action

# repair (like a hard disk's badblock recovery: mark the block as a
# badblock, and it will never be used again):
echo "repair" > /sys/block/md0/md/sync_action

When it is used under raid1, raid10, or any other mirrored mdadm device, the
device with silent corruption can report the block as a badblock (the checksum
doesn't match the data); mdadm has new features to deal with badblocks.
The disk with the badblock will be skipped for that block and the read will be
done from the next good disk...
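
A rough sketch of that mirror fallback, again as a toy Python model and not md code. The names `make_leg` and `mirror_read` are hypothetical, each mirror leg is a dict that keeps a CRC32 next to the data, and the per-leg badblock sets are an assumption about how the bookkeeping could look.

```python
import zlib


def make_leg(blocks):
    """Build one mirror leg: block number -> (data, CRC32 of the data)."""
    return {n: (data, zlib.crc32(data)) for n, data in blocks.items()}


def mirror_read(legs, badblocks, n):
    """Return (data, leg index) from the first leg whose copy of block n verifies.

    legs      -- list of dicts as built by make_leg()
    badblocks -- list of sets, one per leg, updated in place
    """
    for i, leg in enumerate(legs):
        if n in badblocks[i]:
            continue  # this leg already reported the block as bad
        data, stored_crc = leg[n]
        if zlib.crc32(data) == stored_crc:
            return data, i
        # Checksum doesn't match the data: record it as a badblock on this
        # leg only, then try the next leg.
        badblocks[i].add(n)
    raise IOError(f"block {n}: no mirror leg holds a valid copy")
```

The read only fails when every leg has lost the block, which is the same situation a mirror is in today with ordinary (non-silent) badblocks.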

We could use a 32-bit checksum (4 bytes), for example CRC32 (md5 would not
fit: it is 128 bits, 16 bytes), and store slightly less data per block.
Example: today a block is 512 bytes; change it to 508 bytes
of data + 4 bytes of checksum.
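
The 508 + 4 layout could look like the sketch below. CRC32 from Python's `zlib` is used as the 4-byte checksum, which is an assumption (an md5 digest is 16 bytes and would not fit in 4); `pack_sector`, `unpack_sector` and the little-endian byte order are hypothetical choices for illustration.

```python
import struct
import zlib

SECTOR_SIZE = 512
CHECKSUM_SIZE = 4
PAYLOAD_SIZE = SECTOR_SIZE - CHECKSUM_SIZE  # 508 bytes of user data


def pack_sector(payload):
    """Return a 512-byte sector: 508 bytes of data followed by its CRC32."""
    if len(payload) != PAYLOAD_SIZE:
        raise ValueError(f"payload must be {PAYLOAD_SIZE} bytes")
    return payload + struct.pack("<I", zlib.crc32(payload))


def unpack_sector(sector):
    """Verify the trailing CRC32 and return the 508-byte payload."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError(f"sector must be {SECTOR_SIZE} bytes")
    payload = sector[:PAYLOAD_SIZE]
    (stored_crc,) = struct.unpack("<I", sector[PAYLOAD_SIZE:])
    if zlib.crc32(payload) != stored_crc:
        raise ValueError("checksum mismatch: sector is corrupt")
    return payload
```

The cost is 4 bytes out of every 512, so about 0.78% of the capacity. The upper layers would then see 508-byte blocks, so a real implementation would have to remap them to the usual sector sizes somehow.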

I think that's all...


2012/8/2 Jeff Johnson <jeff.johnson@xxxxxxxxxxxxxxxxx>:
> The only ways I know of to currently detect/repair silent data corruption
> are via the use of T10-DIF on SAS drives with 520-byte sectors and embedded
> per block CRCs (bytes 513-520) or via a patented algorithm used in a
> commercial Linux software RAID product (www.streamscale.com).
>
> Neither approach is cost effective for small or personal use RAID
> applications.
>
>
> On 8/2/12 10:04 AM, Roberto Spadim wrote:
>>
>> well i think the integrity is known, but it's not fully needed since
>> security isn't a problem: we can buy secure sata/sas
>> controllers/disks
>> the main problem will come some day when we are using SoC systems
>> and we only have USB to connect a hard drive... maybe when this becomes
>> more popular we will see the development of a module to provide data
>> integrity (silent corruption detection and maybe repair)
>>
>
> --
> ------------------------------
> Jeff Johnson
> Manager
> Aeon Computing
>
> jeff.johnson@xxxxxxxxxxxxxxxxx
> www.aeoncomputing.com
> t: 858-412-3810 x101   f: 858-412-3845
> m: 619-204-9061
>
> /* New Address */
> 4170 Morena Boulevard, Suite D - San Diego, CA 92117
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Roberto Spadim
Spadim Technology / SPAEmpresarial