Re: CRUSH rule for 3 replicas across 2 hosts

On Mon, Apr 20, 2015 at 10:46 AM, Colin Corr <colin@xxxxxxxxxxxxx> wrote:
> Greetings Cephers,
>
> I have hit a bit of a wall between the available documentation and my understanding of it with regard to CRUSH rules. I am trying to determine if it is possible to replicate 3 copies across 2 hosts, such that if one host is completely lost, at least 1 copy is available. The problem I am experiencing is that if I enable my host_rule for a data pool, the cluster never gets back to a clean state. All pgs in a pool with this rule will be stuck unclean.
>
> This is the rule:
>
> rule host_rule {
>         ruleset 2
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step chooseleaf firstn 0 type host
>         step emit
> }
>
> And if it's pertinent, all nodes are running 0.80.9 on Ubuntu 14.04. Pool pg/pgp set to 2048, replicas 3. Tunables set to optimal.
>
> I assume that is happening because of simple math: 3 copies on 2 hosts, and CRUSH is expecting a 3rd host to balance everything out since I defined the rule as host based. This rule runs fine on another 3 host test cluster. So, it would seem that the potential solutions are to change replication to 2 copies or add a 3rd OSD host. But, with all of the cool bucket types and rule options, I suspect I am missing something here. Alas, I am hoping there is some (not so obvious to me) CRUSH magic that could be applied here.

It's actually pretty hacky: you configure your CRUSH rule to return
two OSDs from each host (four in total), but set your pool size to 3.
You'll want to test this carefully with your installed version to make
sure that works, though; older CRUSH implementations would crash if
you did that. :(

In slightly more detail, you'll need to change it so that instead of
using "chooseleaf" you "choose" 2 hosts, and then choose or chooseleaf
2 OSDs from each of those hosts. If you search the list archives for
CRUSH threads you'll find some other discussions about doing precisely
this, and I think the CRUSH documentation should cover the more
general bits of how the language works.
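
A minimal sketch of such a rule, for illustration only. The rule name
"two_host_rule" and the ruleset number 3 are hypothetical; "default" is
the root bucket from the original rule. It has not been verified
against 0.80.9.

```
rule two_host_rule {
        ruleset 3
        type replicated
        min_size 1
        max_size 10
        # start at the root of the hierarchy
        step take default
        # pick 2 host buckets
        step choose firstn 2 type host
        # pick 2 OSDs inside each chosen host (4 candidates in total)
        step choose firstn 2 type osd
        step emit
}
```

With the pool size left at 3, only the first three of the four returned
OSDs should be used, so one host holds two copies and the other holds
one. Before injecting the map, it is probably worth checking the
mappings offline with crushtool's --test mode (for example with --rule,
--num-rep 3 and --show-mappings) against a compiled copy of the map.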
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




