Re: crushmap rule for not using all buckets

I'm not aware of a way to have one pool span all 3 racks and another pool span only 2 of them. If you could put the same OSD in 2 different roots, or have a CRUSH rule choose from 2 different roots, then this might work out, but to my knowledge neither of those is possible.

What is your reasoning for using a min_size of 1? I would strongly recommend against min_size = 1 for anything other than maintenance. Most people who use it believe it means writes are acknowledged once min_size copies have been written, but that is not the case: writes go to every OSD in the PG before the write is acknowledged as complete. min_size is merely how many copies of a PG need to be up for reads and writes to still happen. Running with it set to 1 carries a higher chance of data corruption than I'm comfortable with. Look through the mailing list archives for more on this.
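If you do decide to change it, min_size is a per-pool setting that can be checked and adjusted on the fly ('pool1' below is just a placeholder pool name):

        # 'pool1' is a placeholder; substitute your actual pool name
        ceph osd pool get pool1 min_size
        ceph osd pool set pool1 min_size 2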


On Mon, Sep 4, 2017, 7:59 AM Andreas Herrmann <andreas@xxxxxxxx> wrote:
Hello,

I'm building a 5-server cluster across three rooms/racks. Each server has 8 x
960 GB SSDs used as BlueStore OSDs. Ceph version 12.1.2 is used.

        rack1: server1(mon) server2
        rack2: server3(mon) server4
        rack3: server5(mon)

The crushmap was built this way:

        ceph osd crush add-bucket rack1 rack
        ceph osd crush add-bucket rack2 rack
        ceph osd crush add-bucket rack3 rack

        ceph osd crush move rack1 root=default
        ceph osd crush move rack2 root=default
        ceph osd crush move rack3 root=default

        ceph osd crush move server1 rack=rack1
        ceph osd crush move server2 rack=rack1
        ...
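
The resulting hierarchy can be verified with:

        ceph osd tree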

        rule replicated_rule {
            id 0
            type replicated
            min_size 1
            max_size 10
            step take default
            step chooseleaf firstn 0 type rack
            step emit
        }

I had to change the replicated_rule manually in the decompiled crushmap. Is this
change also possible via the CLI?
        - step chooseleaf firstn 0 type host
        + step chooseleaf firstn 0 type rack
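
As far as I know, Luminous can also create such a rule directly, without
editing the crushmap by hand; a sketch, where 'replicated_rack' is an
arbitrary new rule name chosen for illustration:

        # args: <rule-name> <root> <failure-domain type>
        ceph osd crush rule create-replicated replicated_rack default rack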

The first created pool has size=3/min_size=1 and every copy is in a different
rack. With this setup the cluster is losing the capacity of 2 servers, because
rack3 has only one server.

I'd like to add a second pool with size=2/min_size=1 and a rule that puts copies
only in rack1 and rack2.

Is that possible, or should I think about a completely different solution?

Thanks in advance,
Andreas
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
