Hello Dan, Nathan,

thanks for your replies, and apologies for my silence.

Sorry, I had made a typo: the rule is really 6+4. To reply to Nathan's message: the rule was built this way in anticipation of getting additional servers, at which point I will relax the "2 chunks per OSD" part.

[cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd pool get default.rgw.buckets.data erasure_code_profile
erasure_code_profile: ec_6and4_big
[cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd erasure-code-profile get ec_6and4_big
crush-device-class=big
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=6
m=4
plugin=jerasure
technique=reed_sol_van
w=8

Indeed, Dan:

[cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd dump | grep upmap | grep 116.453
pg_upmap_items 116.453 [76,49,129,108]

I don't think I ever set such an upmap myself. Do you think it would be good to try removing all upmaps, let the upmap balancer do its magic, and check again?
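
A minimal sketch of how the existing upmap entries could be turned into removal commands for review, assuming the `pg_upmap_items <pgid> [...]` line format shown in the `ceph osd dump` output above. The helper name `gen_rm_upmap_cmds` is made up for illustration. The function only prints `ceph osd rm-pg-upmap-items` commands and does not run anything:

```shell
#!/bin/sh
# Sketch: read `ceph osd dump` output on stdin and print one
# `ceph osd rm-pg-upmap-items <pgid>` command per pg_upmap_items line.
# gen_rm_upmap_cmds is a hypothetical helper name, not a Ceph tool.
# The commands are printed for review only; nothing is executed.
gen_rm_upmap_cmds() {
    awk '$1 == "pg_upmap_items" { print "ceph osd rm-pg-upmap-items " $2 }'
}

# Intended use on a live cluster (assumption, not run here):
#   ceph osd dump | gen_rm_upmap_cmds
```

If the balancer is active in upmap mode it may create new entries afterwards, so the mapping of 116.453 is worth re-checking once the cluster has settled.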

Thanks!

Fulvio

On 20/05/2021 18:59, Dan van der Ster wrote:
Hold on: 8+4 needs 12 osds but you only show 10 there. Shouldn't you choose 6 type host and then chooseleaf 2 type osd?

.. Dan

On Thu, May 20, 2021, 1:30 PM Fulvio Galeazzi <fulvio.galeazzi@xxxxxxx> wrote:

Hello Dan, Bryan,

I have a rule similar to yours, for an 8+4 pool, with the only difference that I replaced the second "choose" with "chooseleaf", which I understand should make no difference:

rule default.rgw.buckets.data {
        id 6
        type erasure
        min_size 3
        max_size 10
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default class big
        step choose indep 5 type host
        step chooseleaf indep 2 type osd
        step emit
}

I am on Nautilus 14.2.16, and while performing maintenance the other day I noticed that 2 PGs were incomplete and caused trouble for some users. I then verified that (thanks Bryan for the command):

[cephmgr@cephAdmCT1.cephAdmCT1 clusterCT]$ for osd in $(ceph pg map 116.453 -f json | jq -r '.up[]'); do ceph osd find $osd | jq -r '.host' ; done | sort | uniq -c | sort -n -k1
      2 r2srv07.ct1.box.garr
      2 r2srv10.ct1.box.garr
      2 r3srv07.ct1.box.garr
      4 r1srv02.ct1.box.garr

You see that 4 chunks of the PG were put on r1srv02. Maybe this happened due to some temporary unavailability of the host at some point? As all my servers are now up and running, is there a way to force the placement rule to rerun?

Thanks!

Fulvio

On 5/16/2021 11:40 PM, Dan van der Ster wrote:
> Hi Bryan,
>
> I had to do something similar, and never found a rule to place "up to"
> 2 chunks per host, so I stayed with the placement of *exactly* 2
> chunks per host.
>
> But I did this slightly differently to what you wrote earlier: my rule
> chooses exactly 4 hosts, then chooses exactly 2 osds on each:
>
>     type erasure
>     min_size 3
>     max_size 10
>     step set_chooseleaf_tries 5
>     step set_choose_tries 100
>     step take default class hdd
>     step choose indep 4 type host
>     step choose indep 2 type osd
>     step emit
>
> If you really need the "up to 2" approach then maybe you can split
> each host into two "host" crush buckets, with half the OSDs in each.
> Then a normal host-wise rule should work.
>
> Cheers, Dan
>
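
As a cross-check of the "exactly 2 chunks per host" placement discussed in the quoted messages, here is a small sketch that flags hosts holding more chunks of a PG than expected. It assumes its input is one host name per line, one line per OSD in the PG's up set, such as the `ceph pg map ... | jq` loop quoted earlier would produce before its `sort | uniq -c` stage. The helper name `hosts_over_limit` and its default limit of 2 are illustrative assumptions:

```shell
#!/bin/sh
# Sketch: read one host name per line on stdin (one line per chunk of a PG)
# and print "<host> <count>" for every host holding more than MAX chunks.
# hosts_over_limit is a hypothetical helper; MAX defaults to 2.
hosts_over_limit() {
    max="${1:-2}"
    sort | uniq -c | awk -v max="$max" '$1 > max { print $2, $1 }'
}

# Intended use (assumption, not run here):
#   for osd in $(ceph pg map 116.453 -f json | jq -r '.up[]'); do
#       ceph osd find $osd | jq -r '.host'
#   done | hosts_over_limit 2
```

An empty result means no host exceeds the limit for that PG.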
_______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an email to ceph-users-leave@xxxxxxx