FWIW, we run into this same issue and can't get a good enough SSD-to-spinning-disk ratio, so we decided to simply run the journals on each (spinning) drive for hosts that have 24 slots. The problem gets even worse when we're talking about some of the newer boxes.
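For anyone weighing the same layout, a minimal ceph.conf sketch of colocated journals (paths and the size value are illustrative, not our exact config):

    [osd]
        # Keep each journal on the same spinning disk as its OSD's data,
        # so a single disk failure takes out one OSD instead of every
        # OSD that shared a journal SSD.
        osd journal = /var/lib/ceph/osd/ceph-$id/journal
        # Journal size in MB; size it for a few seconds of sustained
        # writes at the disk's speed.
        osd journal size = 5120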
Warren
On Wed, Sep 18, 2013 at 1:56 PM, Mike Dawson <mike.dawson@xxxxxxxxxxxx> wrote:
Joseph,
With properly architected failure domains and replication in a Ceph cluster, RAID1 has diminishing returns.
A well-designed CRUSH map should allow for failures at any level of your hierarchy (OSDs, hosts, racks, rows, etc) while protecting the data with a configurable number of copies.
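As an illustration, a minimal CRUSH rule sketch that spreads replicas across hosts (bucket and rule names are the stock defaults; adjust to your map):

    rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        # Walk down from the default root and choose one leaf (OSD)
        # per host, so no two replicas land on the same host. Swap
        # "host" for "rack" or "row" to widen the failure domain.
        step take default
        step chooseleaf firstn 0 type host
        step emit
    }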
That being said, losing a series of six OSDs at once is certainly a hassle, and journals on a RAID1 set could help prevent that scenario.
But where do you stop? 3 monitors, 5, 7? RAID1 for OSDs, too? 3x replication, 4x, 10x? I suppose each operator gets to decide how far to chase the diminishing returns.
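(For what it's worth, the replica count is per-pool and cheap to change later, e.g. on a hypothetical pool named "rbd":

    # Keep 3 copies, and keep serving I/O as long as 2 are available.
    ceph osd pool set rbd size 3
    ceph osd pool set rbd min_size 2

so chasing it up or down isn't a one-way decision.)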
Thanks,
Mike Dawson
Co-Founder & Director of Cloud Architecture
Cloudapt LLC
On 9/18/2013 1:27 PM, Gruher, Joseph R wrote:
-----Original Message-----
From: ceph-users-bounces@xxxxxxxxxx.com [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Mike Dawson
you need to understand losing an SSD will cause
the loss of ALL of the OSDs which had their journal on the failed SSD.
First, you probably don't want RAID1 for the journal SSDs. It isn't particularly
needed for resiliency and certainly isn't beneficial from a throughput
perspective.
Sorry, can you clarify this further for me? If losing the SSD would cause losing all the OSDs journaling on it, why would you not want to RAID it?
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com