On Thu, Aug 14, 2014 at 12:47 AM, Christian Balzer <chibi at gol.com> wrote:
>
> Hello,
>
> On Tue, 12 Aug 2014 10:53:21 -0700 Craig Lewis wrote:
>
>> That's a low probability, given the number of disks you have. I would've
>> taken that bet (with backups). As the number of OSDs goes up, the
>> probability of multiple simultaneous failures goes up, and slowly
>> becomes a bad bet.
>>
>
> I must be very unlucky then. ^o^
> As in, I've had dual disk failures in a set of 8 disks 3 times now
> (within the last 6 years).
> And twice that led to data loss, once with RAID5 (no surprise there) and
> once with RAID10 (unlucky failure of neighboring disks).
> Granted, that was with consumer HDDs and the last one with rather well
> aged ones, too. But there you go.

Yeah, I'd say you're unlucky, unless you're running a pretty large cluster.

I usually run my 8 disk arrays in RAID-Z2 / RAID6 though; 5 disks is my
limit for RAID-Z1 / RAID5.

I've been lucky so far. No double failures in my RAID-Z1 / RAID5 arrays,
and no triple failures in my RAID-Z2 / RAID6 arrays. After 15 years and
hundreds of arrays, I should've had at least one. I have had several
double failures in RAID1, but none of those were important.

If this isn't a big cluster, I would suspect that you have a vibration or
power issue. Both are known to cause premature death in HDDs. Of course,
rebuilding a degraded RAID is also a well-known cause of premature HDD
death.

> As for backups, those are for when somebody does something stupid and
> deletes stuff they shouldn't have.
> A storage system should be a) up all the time and b) not lose data.

I completely agree, but never trust it.
Over the years, I've used backups to recover when:
- I do something stupid
- My developers do something stupid
- Hardware does something stupid
- Manufacturer firmware does something stupid
- Manufacturer tech support tells me to do something stupid
- My datacenter does something stupid
- My power companies do something stupid

I've lost data on everything from a software RAID0 all the way up to a
quadruply-redundant, multi-million-dollar hardware storage array.
Regardless of the promises printed on the box, it's the contingency plans
that keep the paychecks coming.
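
For anyone who wants rough numbers on the double-failure odds discussed earlier in the thread, here is a minimal sketch. It is not anyone's measured data: the 5% annual failure rate and the one-day rebuild window are made-up illustration values, the function names are hypothetical, and it assumes disks fail independently at a constant rate. That independence assumption is exactly what vibration, power problems, and rebuild stress break, so treat the output as a lower bound at best.

```python
def window_failure_prob(afr: float, days: float) -> float:
    """Chance that one disk fails within `days`, given an annual failure
    rate `afr` and a constant hazard rate (an assumption)."""
    return 1.0 - (1.0 - afr) ** (days / 365.0)


def second_failure_during_rebuild(n_disks: int, afr: float,
                                  rebuild_days: float) -> float:
    """Chance that at least one of the remaining n_disks - 1 disks fails
    while the first failed disk is still being rebuilt.
    Assumes independent failures."""
    q = window_failure_prob(afr, rebuild_days)
    return 1.0 - (1.0 - q) ** (n_disks - 1)


# Hypothetical illustration values, not figures from this thread.
AFR = 0.05          # assumed annual failure rate per disk
REBUILD_DAYS = 1.0  # assumed time to rebuild / re-replicate

for n in (8, 24, 100, 1000):
    p = second_failure_during_rebuild(n, AFR, REBUILD_DAYS)
    print(f"{n:5d} disks: {p:.4%} chance of a second failure during rebuild")
```

The per-incident odds grow with the disk count, and so does the number of first failures per year, which is why the bet slowly gets worse as a cluster grows. In a Ceph cluster a second failure only loses data if it hits an OSD sharing placement groups with the first, which this sketch does not model.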