Re: CRUSH rule for EC 6+2 on 6-node cluster

Hello Dan, Bryan,
I have a rule similar to yours, for an 8+4 pool, with the only difference that I replaced the second "choose" with "chooseleaf", which I understand should make no difference:

rule default.rgw.buckets.data {
        id 6
        type erasure
        min_size 3
        max_size 10
        # retry budgets for the placement algorithm
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        # start from the default root, restricted to the "big" device class
        step take default class big
        # pick 5 distinct hosts, then 2 OSDs on each: 5 x 2 = 10 chunks
        step choose indep 5 type host
        step chooseleaf indep 2 type osd
        step emit
}
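
One way to sanity-check that such a rule can really map all 10 chunks is to test it offline with crushtool; rule id 6 and --num-rep 10 below match the rule above, adjust them to your own map:

        # grab the current CRUSH map and test the rule against it
        ceph osd getcrushmap -o crushmap.bin
        # print any inputs for which the rule maps fewer than 10 OSDs
        crushtool -i crushmap.bin --test --rule 6 --num-rep 10 --show-bad-mappings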

I am on Nautilus 14.2.16 and, while performing maintenance the other day, I noticed 2 PGs were incomplete, which caused trouble for some users. I then verified (thanks Bryan for the command) that:

[cephmgr@cephAdmCT1.cephAdmCT1 clusterCT]$ for osd in $(ceph pg map 116.453 -f json | jq -r '.up[]'); do ceph osd find $osd | jq -r '.host' ; done | sort | uniq -c | sort -n -k1
      2 r2srv07.ct1.box.garr
      2 r2srv10.ct1.box.garr
      2 r3srv07.ct1.box.garr
      4 r1srv02.ct1.box.garr

  You can see that 4 chunks of this PG were placed on r1srv02.
Maybe this happened due to some temporary unavailability of that host at some point? As all my servers are now up and running, is there a way to force the placement rule to rerun?
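
For the record, comparing the PG's "up" and "acting" sets shows whether CRUSH still wants a different placement; "ceph pg repeer" is my guess at a gentle way to nudge a stuck PG, so please treat it as a suggestion to verify rather than a confirmed fix:

        # do up and acting still disagree for the problematic PG?
        ceph pg map 116.453 -f json | jq '{up: .up, acting: .acting}'
        # if so, ask the PG to peer again
        ceph pg repeer 116.453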

  Thanks!

			Fulvio


On 5/16/2021 11:40 PM, Dan van der Ster wrote:
Hi Bryan,

I had to do something similar, and never found a rule to place "up to"
2 chunks per host, so I stayed with the placement of *exactly* 2
chunks per host.

But I did this slightly differently from what you wrote earlier: my rule
chooses exactly 4 hosts, then exactly 2 OSDs on each:

         type erasure
         min_size 3
         max_size 10
         step set_chooseleaf_tries 5
         step set_choose_tries 100
         step take default class hdd
         # exactly 4 hosts, then exactly 2 OSDs on each: 4 x 2 = 8 chunks
         step choose indep 4 type host
         step choose indep 2 type osd
         step emit

If you really need the "up to 2" approach then maybe you can split
each host into two "host" crush buckets, with half the OSDs in each.
Then a normal host-wise rule should work.
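
Roughly like this; the bucket names, OSD ids and weights below are invented
for the example, so adjust them to your cluster:

         # second "host" bucket for the same physical node
         ceph osd crush add-bucket r1srv02-b host
         ceph osd crush move r1srv02-b root=default
         # re-home half of the node's OSDs (crush set also sets the weight)
         ceph osd crush set osd.12 3.64 root=default host=r1srv02-b
         ceph osd crush set osd.13 3.64 root=default host=r1srv02-b

With a plain chooseleaf rule over type host, CRUSH then places at most 2
chunks per physical machine, at the cost of some manual bookkeeping.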

Cheers, Dan


--
Fulvio Galeazzi
GARR-CSD Department
skype: fgaleazzi70
tel.: +39-334-6533-250


