Re: copy full system from old disk to a new one

Steve Ellis <ellis@xxxxxxxxxxxx> · Tue, 19 Feb 2013 22:22:41 -0800

On Tue, Feb 19, 2013 at 3:52 PM, Gordan Bobic <gordan@xxxxxxxxxx> wrote:

On 02/19/2013 10:00 PM, Reindl Harald wrote:

No, my experience does not go as far back as 6 years, for obvious reasons. My experience with mechanical disks, however, goes back 25 years, and I can promise you, they are every bit as unreliable as you fear the SSDs might be.

So, my experience with mechanical disks dates back 25 years as well (my first was a 5.25" HH 20MB drive in a PC I bought in 1986), but I've had more frightening experiences with SSDs (and yet I still use them) than I have with conventional drives.  I've had 3 complete and total failures of name-brand SSDs (all from the same vendor, unfortunately) within the course of 1 year. All of the drives were less than 1 year old and were deployed in fairly conventional desktop machines; one was a warranty replacement for an earlier failure.  I've had unpleasant experiences with conventional disks as well, but I don't believe I've ever had more than one conventional drive fail so completely that _no_ data could be recovered--all of my SSD failures were like that.

data without RAID is useless

My point was that even RAID is next to useless because it doesn't protect you against bit-rot.

As we all know, conventional drives (and, I believe, SSDs) use extensive error detection/correction so that the drive will know if a block is unreliable (most of the time the drive will manage to remap that block elsewhere before it becomes unrecoverable). Individual drives only _very_ rarely manage to return the wrong data (I'm actually not sure I've _ever_ seen that happen).

The problem with RAID is when no one is looking to see if the RAID system had to correct blocks. Once you see more than a couple of RAID corrections happen, it is time to replace a disk; if no one looks at the logs, then eventually there will be a double (or in the case of RAID6, triple) failure, and you will lose data.  A further problem with RAID is when some of the blocks are never read.  Any reasonable RAID controller will not only make the log of RAID corrections available (mine helpfully emails me when corrections happen), but will also have the option of scanning the entire RAID volume periodically to look for undetected individual block failures (my system does this scan 2x per week).  I've never used software RAID, so I don't know if these options are available (but I assume they are).  It would be suicidal to rely on any RAID system that didn't offer both logs of corrections and an easy way to scan every single block (including unused blocks) looking for unnoticed issues.
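
As a hedged illustration of the software RAID case: the Linux md driver exposes a per-array `sync_action` file (writing `check` to it requests a full read-and-verify pass) and a `mismatch_cnt` file holding the count from the most recent pass. The sketch below only wraps those two files; the helper names, the `md0` array name used in the note afterwards, and the alert threshold are assumptions for illustration, not anything from this thread.

```python
from pathlib import Path


def start_scrub(md_dir: Path) -> None:
    """Ask the md driver to read and verify every block of the array."""
    (md_dir / "sync_action").write_text("check\n")


def read_mismatch_count(md_dir: Path) -> int:
    """Return the mismatch count recorded by the most recent scrub."""
    return int((md_dir / "mismatch_cnt").read_text().strip())


def needs_attention(mismatches: int, threshold: int = 0) -> bool:
    """Hypothetical policy: flag the array once mismatches exceed the threshold."""
    return mismatches > threshold
```

On a real system this would be pointed at something like `Path("/sys/block/md0/md")` and run as root from a scheduled job, mailing the result when `needs_attention` returns true. That roughly mirrors the "email me on corrections, scan twice a week" setup described above, though what counts as a worrying mismatch count depends on the RAID level.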

and in case of RAID you have to have at least one full backup

and so I do not care if disks are dying

Depends how many versioned backups you have, I suppose. It is possible to not notice RAID-silenced bit-rot for a long time, especially with a lot of data.
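
One way to notice this kind of silent corruption independently of the RAID layer is to keep a checksum manifest of files that are not supposed to change and re-verify it periodically. The following is a minimal sketch under that assumption; the function names are hypothetical, and it cannot distinguish bit-rot from a legitimate edit, so it only suits archival data.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return manifest entries whose file is missing or no longer matches."""
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            bad.append(rel)
    return bad
```

The manifest itself would need to be stored somewhere other than the volume being checked (and ideally alongside each versioned backup), so that a mismatch can be traced to the last backup in which the file was still good.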

I have a 5x1TB RAID5 (plus 1 hot spare) system (I suppose this is no longer considered a lot of data, but it was to me when I built it) that has _never_ had an unrecoverable problem, and I've now replaced every drive at least once (and I just started a migration to a 3x3TB RAID5 w/ spare before any more fail).  I built my system in late 2003 (with 250GB drives), and the only time the RAID system has been down for more than a few minutes is when I migrate either drives or controller (or when I upgrade Fedora).
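
For reference, the usable capacity of the two layouts mentioned here can be worked out with the usual RAID5 rule that one disk's worth of space goes to parity and hot spares contribute nothing. This is a small sketch; the function name is hypothetical and the sizes are nominal TB.

```python
def raid5_usable_tb(active_disks: int, disk_tb: float) -> float:
    """Usable RAID5 capacity: one disk's worth of space is consumed by parity."""
    if active_disks < 3:
        raise ValueError("RAID5 needs at least 3 active disks")
    return (active_disks - 1) * disk_tb


old_array = raid5_usable_tb(5, 1.0)  # 5x1TB, hot spare not counted
new_array = raid5_usable_tb(3, 3.0)  # 3x3TB, spare not counted
print(old_array, new_array)  # prints "4.0 6.0"
```

So the migration goes from roughly 4TB to roughly 6TB usable while cutting the number of active spindles from five to three.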

-se
--
They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety.
                               --Benjamin Franklin, 1759
