Hi Sage,
I have a similar question. I need 2 replicas (one on each rack), and I would like to know whether the following rule always places the primary on rack1:
rule data {
        ruleset 0
        type replicated
        min_size 2
        max_size 2
        step take rack1
        step chooseleaf firstn 1 type host
        step emit
        step take rack2
        step chooseleaf firstn 1 type host
        step emit
}

If so, I was wondering if you could tell me whether the following rule will do the same thing by spreading the primary and replica across rack1 and rack2:
rule data {
        ruleset 0
        type replicated
        min_size 2
        max_size 2
        step take row1
        step choose firstn 2 type rack
        step chooseleaf firstn 1 type host
        step emit
}
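One way to check this (a rough sketch; crush.txt, crushmap.bin, and the rule number 0 are placeholders for your own map) is to compile the map and simulate placements with crushtool:

    # compile the text map, then simulate placements for rule 0 with 2 replicas
    crushtool -c crush.txt -o crushmap.bin
    crushtool -i crushmap.bin --test --rule 0 --num-rep 2 --show-mappings

Each output line lists the OSDs chosen for one input value, and for a replicated pool the first OSD listed acts as the primary, so it should be easy to see whether it always lands under rack1.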
Thanks in advance,
Sherry
On Tuesday, January 21, 2014 7:00 AM, Sage Weil <sage@xxxxxxxxxxx> wrote:
On Mon, 20 Jan 2014, Arnulf Heimsbakk wrote:
> Hi,
>
> I'm trying to understand the CRUSH algorithm and how it distributes data.
> Let's say I simplify a small datacenter setup and map it
> hierarchically in the crush map as shown below.
>
>         root            datacenter
>        /    \
>       /      \
>      /        \
>     a          b        room
>   / | \      / | \
>  a1 a2 a3   b1 b2 b3    rack
>  |  |  |    |  |  |
>  h1 h2 h3   h4 h5 h6    host
>
> I want 4 copies of all data in my pool, configured at the pool level: 2
> copies in each room. And I want to be sure that no 2 copies reside in the
> same rack when there are no HW failures.
>
> Will the chooseleaf rule below ensure this placement?
>
> step take root
> step chooseleaf firstn 0 type room
> step emit
This won't ensure the 2 copies in each room are in different racks.
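To see what the rule actually returns, a quick check (a sketch; crushmap.bin and the rule number 0 are placeholders for your map) is:

    # simulate 4-replica placements; --show-bad-mappings flags any input
    # that maps fewer OSDs than --num-rep requests
    crushtool -i crushmap.bin --test --rule 0 --num-rep 4 \
        --show-mappings --show-bad-mappings

Comparing the emitted OSD lists against the hierarchy above shows which racks the copies land in.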
> Or do I have to specify this further, like
>
> step take root
> step choose firstn 2 type room
> step chooseleaf firstn 2 type rack
> step emit
I think this is what you want. The thing it won't do is decide to put 4
replicas in room b when room a goes down completely... but at that scale,
that is generally not what you want anyway.
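Written out as a complete rule in the same form as above (a sketch; the rule name replicated_rooms, ruleset number 1, and the size bounds are placeholders to adapt to your map):

    rule replicated_rooms {
            ruleset 1
            type replicated
            min_size 4
            max_size 4
            step take root
            step choose firstn 2 type room
            step chooseleaf firstn 2 type rack
            step emit
    }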
> Or even more explicitly, like this?
>
> step take a
> step choose firstn 2 type rack
> step chooseleaf firstn 1 type host
> step emit
> step take b
> step choose firstn 2 type rack
> step chooseleaf firstn 1 type host
> step emit
>
> Is there a difference in failure behaviour between the different configurations?
This would work too, but assumes you only have 2 rooms, and that you
always want the primary copy to be in room a (which means the reads go
there). The previous rule will spread the primary responsibility across
both rooms.
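One rough way to confirm that spread (a sketch; it assumes crushtool's --show-mappings output format "CRUSH rule R x N [osd,...]", and the map file and rule number are placeholders) is to count which OSD comes first over many inputs:

    # the first OSD in each mapping is the primary; count how often each
    # OSD holds that role across 1024 sample inputs
    crushtool -i crushmap.bin --test --rule 1 --num-rep 4 \
        --min-x 0 --max-x 1023 --show-mappings \
      | awk -F'[][,]' '{print $2}' | sort -n | uniq -c

With the two-room rule the counts should split between OSDs in both rooms; with the take-a/take-b rule the first OSD will always be under room a.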
sage
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com