Re: CRUSH rule for 3 replicas across 2 hosts

On 04/20/2015 11:02 AM, Robert LeBlanc wrote:
> We have a similar issue, but we wanted three copies across two racks. What we ended up doing was increasing size to 4 and leaving min_size at 2. We didn't want to risk having fewer than two copies, and if we only had three copies, losing a rack would block I/O. Once we expand to a third rack, we will adjust our rule and go to size 3. Searching the mailing list and docs proved difficult, so I'll include my rule so that you can use it as a basis. You should be able to just change rack to host and host to osd. If you want to keep only three copies, the "extra" OSD chosen just won't be used, as Gregory mentions. Technically this rule should have "max_size 4", but I won't set a pool over 4 copies, so I didn't change it here.
> 
> If anyone has a better way of writing this rule (or one that would work for both a two-rack and a 3+ rack configuration, as mentioned above), I'd be open to it. This is the first rule that I've really written on my own.
> 
> rule replicated_ruleset {
>         ruleset 0
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step choose firstn 2 type rack
>         step chooseleaf firstn 2 type host
>         step emit
> }

Thank you Robert. Your example was very helpful. I didn't realize you could chain choose and chooseleaf steps like that; I thought chooseleaf effectively handled all of it for you already. This makes a lot more sense now.
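
For anyone else who lands on this thread, here is how I now read Robert's rule (my own annotations, so treat them as a sketch rather than gospel):

rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        # start the selection at the root of the hierarchy
        step take default
        # pick 2 rack buckets under the root
        step choose firstn 2 type rack
        # within each chosen rack, pick 2 hosts and take one OSD (a leaf)
        # under each, yielding 2 racks x 2 hosts = 4 OSDs in total
        step chooseleaf firstn 2 type host
        step emit
}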

My rule looks like this now:
rule host_rule {
        ruleset 2
        type replicated
        min_size 2
        max_size 3
        step take default
        step choose firstn 2 type host
        step chooseleaf firstn 2 type osd
        step emit
}
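
For anyone else doing this, crushtool can verify the mappings before you inject the new map (the file names below are just placeholders):

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt to add the rule, then recompile and test
crushtool -c crushmap.txt -o crushmap-new.bin
# simulate rule 2 with 3 replicas and print the resulting OSD sets
crushtool -i crushmap-new.bin --test --rule 2 --num-rep 3 --show-mappings
ceph osd setcrushmap -i crushmap-new.bin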

And the cluster is finally reporting the pool as clean. If I understand correctly, the rule hands back up to 4 OSDs (2 per host), but with size set to 3 only the first 3 are used, so we end up with 2 replicas on one host and 1 on the other; the fourth is the unused "extra" OSD Greg describes below.
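
For completeness, pointing a pool at the new rule is just a couple of pool settings (the pool name is a placeholder; on this version of Ceph the setting is still called crush_ruleset):

# assign the pool to ruleset 2 and set the replica counts
ceph osd pool set <pool-name> crush_ruleset 2
ceph osd pool set <pool-name> size 3
ceph osd pool set <pool-name> min_size 2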

> On Mon, Apr 20, 2015 at 11:50 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:

>     It's actually pretty hacky: you configure your CRUSH rule to return
>     two OSDs from each host, but set your size to 3. You'll want to test
>     this carefully with your installed version to make sure that works,
>     though — older CRUSH implementations would crash if you did that. :(
> 
>     In slightly more detail, you'll need to change it so that instead of
>     using "chooseleaf" you "choose" 2 hosts, and then choose or chooseleaf
>     2 OSDs from each of those hosts. If you search the list archives for
>     CRUSH threads you'll find some other discussions about doing precisely
>     this, and I think the CRUSH documentation should cover the more
>     general bits of how the language works.
>     -Greg

Thank you Greg, I had trouble searching for discussions related to this. The Google was not being friendly, or I wasn't issuing a good query. My understanding of choose vs. chooseleaf, and of using multiple choose steps in a rule, will send me back to the docs for the remainder of my day.
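
As an aside, for Robert's 3+ rack question: my reading of the docs is that once there are at least as many racks as replicas, the nesting becomes unnecessary and a single chooseleaf step should do. An untested sketch (the rule name and ruleset number here are arbitrary):

rule replicated_3racks {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        # firstn 0 means "choose as many as the pool's size",
        # placing each replica under a distinct rack
        step chooseleaf firstn 0 type rack
        step emit
}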

Thanks,

Colin



