Re: CRUSH rule for EC 6+2 on 6-node cluster

Hello Dan, Bryan,
I have a rule similar to yours, for an 8+4 pool, with the only difference that I replaced the second "choose" with "chooseleaf", which I understand should make no difference:

rule default.rgw.buckets.data {
        id 6
        type erasure
        min_size 3
        max_size 10
        # retry budgets for the placement algorithm
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        # start from the default root, restricted to the "big" device class
        step take default class big
        # pick 5 distinct hosts, then 2 OSDs on each: 5 x 2 = 10 chunks
        step choose indep 5 type host
        step chooseleaf indep 2 type osd
        step emit
}
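
One way to sanity-check that such a rule can really map all 10 chunks is to test it offline with crushtool; rule id 6 and --num-rep 10 below match the rule above, adjust them to your own map:

        # grab the current CRUSH map and test the rule against it
        ceph osd getcrushmap -o crushmap.bin
        # print any inputs for which the rule maps fewer than 10 OSDs
        crushtool -i crushmap.bin --test --rule 6 --num-rep 10 --show-bad-mappings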

I am on Nautilus 14.2.16 and, while performing maintenance the other day, I noticed 2 PGs were incomplete, which caused trouble for some users. I then verified (thanks Bryan for the command) that:

[cephmgr@cephAdmCT1.cephAdmCT1 clusterCT]$ for osd in $(ceph pg map 116.453 -f json | jq -r '.up[]'); do ceph osd find $osd | jq -r '.host' ; done | sort | uniq -c | sort -n -k1
      2 r2srv07.ct1.box.garr
      2 r2srv10.ct1.box.garr
      2 r3srv07.ct1.box.garr
      4 r1srv02.ct1.box.garr

  You can see that 4 chunks of this PG were placed on r1srv02.
Maybe this happened due to some temporary unavailability of that host at some point? As all my servers are now up and running, is there a way to force the placement rule to rerun?
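
For the record, comparing the PG's "up" and "acting" sets shows whether CRUSH still wants a different placement; "ceph pg repeer" is my guess at a gentle way to nudge a stuck PG, so please treat it as a suggestion to verify rather than a confirmed fix:

        # do up and acting still disagree for the problematic PG?
        ceph pg map 116.453 -f json | jq '{up: .up, acting: .acting}'
        # if so, ask the PG to peer again
        ceph pg repeer 116.453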

  Thanks!

			Fulvio


On 5/16/2021 11:40 PM, Dan van der Ster wrote:
Hi Bryan,

I had to do something similar, and never found a rule to place "up to"
2 chunks per host, so I stayed with the placement of *exactly* 2
chunks per host.

But I did this slightly differently from what you wrote earlier: my rule
chooses exactly 4 hosts, then exactly 2 OSDs on each:

         type erasure
         min_size 3
         max_size 10
         step set_chooseleaf_tries 5
         step set_choose_tries 100
         step take default class hdd
         # exactly 4 hosts, then exactly 2 OSDs on each: 4 x 2 = 8 chunks
         step choose indep 4 type host
         step choose indep 2 type osd
         step emit

If you really need the "up to 2" approach then maybe you can split
each host into two "host" crush buckets, with half the OSDs in each.
Then a normal host-wise rule should work.
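
Roughly like this; the bucket names, OSD ids and weights below are invented
for the example, so adjust them to your cluster:

         # second "host" bucket for the same physical node
         ceph osd crush add-bucket r1srv02-b host
         ceph osd crush move r1srv02-b root=default
         # re-home half of the node's OSDs (crush set also sets the weight)
         ceph osd crush set osd.12 3.64 root=default host=r1srv02-b
         ceph osd crush set osd.13 3.64 root=default host=r1srv02-b

With a plain chooseleaf rule over type host, CRUSH then places at most 2
chunks per physical machine, at the cost of some manual bookkeeping.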

Cheers, Dan


--
Fulvio Galeazzi
GARR-CSD Department
skype: fgaleazzi70
tel.: +39-334-6533-250


