Re: OSD and Journal Files

Mike Dawson <mike.dawson@xxxxxxxxxxxx> · Wed, 18 Sep 2013 13:56:34 -0400

Joseph,

With properly architected failure domains and replication in a Ceph 
cluster, RAID1 has diminishing returns.

A well-designed CRUSH map should allow for failures at any level of your 
hierarchy (OSDs, hosts, racks, rows, etc.) while protecting the data with
a configurable number of copies.
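
To make the failure-domain idea concrete, here is a toy sketch in Python. It is not the real CRUSH algorithm: it uses a simple highest-hash-score pick as a stand-in, and the host names, OSD ids, and function names are all made up for illustration. The only point is that each copy lands on a different host, so losing any single host costs at most one copy.

```python
import hashlib

# Hypothetical toy cluster: host name -> OSD ids on that host.
CLUSTER = {
    "host-a": [0, 1, 2],
    "host-b": [3, 4, 5],
    "host-c": [6, 7, 8],
    "host-d": [9, 10, 11],
}


def _score(obj, item):
    """Deterministic pseudo-random score for an (object, item) pair."""
    digest = hashlib.sha256(f"{obj}:{item}".encode()).hexdigest()
    return int(digest, 16)


def place(obj, copies, cluster=CLUSTER):
    """Pick `copies` (host, osd) pairs for `obj`, never two on the same host."""
    if copies > len(cluster):
        raise ValueError("not enough hosts to put each copy on its own host")
    # Rank hosts by score and keep the top `copies` of them.
    hosts = sorted(cluster, key=lambda h: _score(obj, h), reverse=True)[:copies]
    # Within each chosen host, take the OSD with the highest score.
    return [
        (h, max(cluster[h], key=lambda o: _score(obj, f"osd.{o}")))
        for h in hosts
    ]


def surviving_copies(placement, failed_host):
    """OSDs that still hold a copy after `failed_host` goes down."""
    return [osd for host, osd in placement if host != failed_host]
```

With `place("some-object", 3)`, any one host can fail and `surviving_copies` still returns at least two OSDs. A real CRUSH rule applies the same separation at whichever level of the hierarchy you choose (host, rack, row).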

That being said, losing a series of six OSDs is certainly a hassle, and
journals on a RAID1 set could help prevent that scenario.
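
As a rough illustration of that trade-off, the sketch below counts which OSDs go down when journal SSDs fail, with and without mirroring. The layouts, device names, and the `osds_lost` helper are hypothetical; the only rule modeled is that an OSD is lost once every SSD backing its journal has failed.

```python
# Each entry: (SSDs backing the journal device, OSD ids journaling on it).
# One SSD in the tuple means a plain journal SSD, two means a RAID1 pair.
# Both layouts are hypothetical examples.
PLAIN = [
    (("ssd0",), [0, 1, 2, 3, 4, 5]),
    (("ssd1",), [6, 7, 8, 9, 10, 11]),
]
MIRRORED = [
    (("ssd0", "ssd1"), list(range(12))),
]


def osds_lost(layout, failed_ssds):
    """Return the OSDs whose journal device has lost all of its SSDs."""
    failed = set(failed_ssds)
    lost = []
    for members, osds in layout:
        if set(members) <= failed:  # every mirror member is gone
            lost.extend(osds)
    return sorted(lost)
```

Here `osds_lost(PLAIN, ["ssd0"])` returns the six OSDs on that SSD, while `osds_lost(MIRRORED, ["ssd0"])` returns an empty list. The price in the mirrored layout is that the same two SSDs now carry the journal writes for all twelve OSDs.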

But where do you stop? 3 monitors, 5, 7? RAID1 for OSDs, too? 3x 
replication, 4x, 10x? I suppose each operator gets to decide how far to 
chase the diminishing returns.
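
For a back-of-the-envelope feel for those diminishing returns, here is a deliberately oversimplified model. It assumes each copy is lost independently with the same probability p over some window, which real clusters do not guarantee (correlated failures are the whole reason for failure domains). The value of p and the function names are assumptions for illustration only.

```python
def loss_probability(p, copies):
    """Chance that every copy is lost, assuming independent failures."""
    return p ** copies


def marginal_gain(p, copies):
    """Absolute risk removed by going from `copies` to `copies + 1`."""
    return loss_probability(p, copies) - loss_probability(p, copies + 1)


def usable_fraction(copies):
    """Share of raw capacity left for data at a given replication level."""
    return 1.0 / copies


if __name__ == "__main__":
    p = 0.01  # assumed per-copy loss probability, purely illustrative
    for n in range(1, 6):
        print(n, loss_probability(p, n), marginal_gain(p, n), usable_fraction(n))
```

Under this model each extra copy removes less absolute risk than the one before it, while the raw capacity you pay for grows linearly with the number of copies.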

Thanks,

Mike Dawson
Co-Founder & Director of Cloud Architecture
Cloudapt LLC

On 9/18/2013 1:27 PM, Gruher, Joseph R wrote:

-----Original Message-----
From: ceph-users-bounces@xxxxxxxxxxxxxx [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Mike Dawson

you need to understand losing an SSD will cause
the loss of ALL of the OSDs which had their journal on the failed SSD.

First, you probably don't want RAID1 for the journal SSDs. It isn't particularly
needed for resiliency and certainly isn't beneficial from a throughput
perspective.

Sorry, can you clarify this further for me? If losing the SSD would cause the loss of all the OSDs journaling on it, why would you not want to RAID it?
