After updating the CRUSH rule from
rule cephfs_ec {
    id 1
    type erasure
    min_size 8
    max_size 8
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default
    step choose indep 4 type host
    step choose indep 2 type osd
    step emit
}
to
rule cephfs_ec {
    id 1
    type erasure
    min_size 8
    max_size 12
    #step set_chooseleaf_tries 6
    step set_choose_tries 100
    step take default
    step choose indep 6 type host
    step choose indep 2 type osd
    step emit
}
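In case it is useful to others, the edit can be applied with the usual CRUSH map round trip, roughly like this (the file names are just placeholders):

ceph1 ~ # ceph osd getcrushmap -o crush.bin
ceph1 ~ # crushtool -d crush.bin -o crush.txt
(edit the cephfs_ec rule in crush.txt as shown above)
ceph1 ~ # crushtool -c crush.txt -o crush.new
ceph1 ~ # ceph osd setcrushmap -i crush.new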
upmap is not complaining anymore and is working with the six hosts.
It seems that with the first rule CRUSH does not stop picking hosts after the first four, and then complains when it gets a fifth host.
Is this a bug or intended behaviour?
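A quick way to sanity-check what the edited rule resolves to, assuming the compiled map from above is in crush.new, is to let crushtool simulate the mappings for rule id 1 with eight chunks (k=6, m=2):

ceph1 ~ # crushtool -i crush.new --test --rule 1 --num-rep 8 --show-mappings

Each PG should still map to eight OSDs, two per host, and across the PGs all six hosts should now appear.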
Regards
Eric
On Tue, Sep 17, 2019 at 3:55 PM Eric Dold <dold.eric@xxxxxxxxx> wrote:
With ceph 14.2.4 it's the same. The upmap balancer is not working. Any ideas?

On Wed, Sep 11, 2019 at 11:32 AM Eric Dold <dold.eric@xxxxxxxxx> wrote:

Hello,

I'm running ceph 14.2.3 on six hosts with four OSDs each. I recently upgraded this from four hosts.
The cluster is running fine, but I get this in my logs:

Sep 11 11:02:41 ceph1 ceph-mon[1333]: 2019-09-11 11:02:41.953 7f26023a6700 -1 verify_upmap number of buckets 5 exceeds desired 4
Sep 11 11:02:41 ceph1 ceph-mon[1333]: 2019-09-11 11:02:41.953 7f26023a6700 -1 verify_upmap number of buckets 5 exceeds desired 4
Sep 11 11:02:41 ceph1 ceph-mon[1333]: 2019-09-11 11:02:41.953 7f26023a6700 -1 verify_upmap number of buckets 5 exceeds desired 4

It looks like the balancer is not doing any work.

Here is some info about the cluster:

ceph1 ~ # ceph osd crush rule ls
replicated_rule
cephfs_ec
ceph1 ~ # ceph osd crush rule dump replicated_rule
{
    "rule_id": 0,
    "rule_name": "replicated_rule",
    "ruleset": 0,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -1,
            "item_name": "default"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}
ceph1 ~ # ceph osd crush rule dump cephfs_ec
{
    "rule_id": 1,
    "rule_name": "cephfs_ec",
    "ruleset": 1,
    "type": 3,
    "min_size": 8,
    "max_size": 8,
    "steps": [
        {
            "op": "set_chooseleaf_tries",
            "num": 5
        },
        {
            "op": "set_choose_tries",
            "num": 100
        },
        {
            "op": "take",
            "item": -1,
            "item_name": "default"
        },
        {
            "op": "choose_indep",
            "num": 4,
            "type": "host"
        },
        {
            "op": "choose_indep",
            "num": 2,
            "type": "osd"
        },
        {
            "op": "emit"
        }
    ]
}
ceph1 ~ # ceph osd erasure-code-profile ls
default
isa_62
ceph1 ~ # ceph osd erasure-code-profile get default
k=2
m=1
plugin=jerasure
technique=reed_sol_van
ceph1 ~ # ceph osd erasure-code-profile get isa_62
crush-device-class=
crush-failure-domain=osd
crush-root=default
k=6
m=2
plugin=isa
technique=reed_sol_van

The idea with four hosts was that the EC profile should take two OSDs on each host for the eight chunks.
Now with six hosts I guess two hosts will have two chunks on two OSDs each and four hosts will each hold one chunk of a piece of data.

Any idea how to resolve this?

Regards
Eric
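To see how the eight chunks of a PG actually land on the hosts, one option is to compare a PG's acting set against the OSD tree (cephfs_data here is just a placeholder for the EC pool name):

ceph1 ~ # ceph pg ls-by-pool cephfs_data
ceph1 ~ # ceph osd tree

The UP/ACTING columns list the eight OSDs holding each PG's chunks, and the tree shows which host every one of those OSDs sits under.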