Hi Andres,
does the command work with the original rule/crushmap?
___________________________________
Clyso GmbH - Ceph Foundation Member
support@xxxxxxxxx
https://www.clyso.com
On 06.05.2021 at 15:21, Andres Rojas Guerrero wrote:
Yes, my ceph version is Nautilus:
# ceph -v
ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)
First dump the crush map:
# ceph osd getcrushmap -o crush_map
Then, decompile the crush map:
# crushtool -d crush_map -o crush_map_d
Now, edit the crush rule and compile:
# crushtool -c crush_map_d -o crush_map_new
And finally, test the mappings:
# crushtool -i crush_map_new --test --rule 2 --num-rep 7 --show-mappings
CRUSH rule 2 x 0 [-5,-45,-49,-47,-43,-41,-29]
*** Caught signal (Segmentation fault) **
in thread 7f2d717acb40 thread_name:crushtool
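If crushtool keeps segfaulting on --show-mappings, a possible workaround (a sketch, not verified on 14.2.6; the pool id 2 is just an example) is to test the edited crushmap through the OSD map with osdmaptool instead:

```shell
# Export the current OSD map, swap in the edited crushmap,
# and check how the PGs of the pool would map with it.
ceph osd getmap -o osdmap
osdmaptool osdmap --import-crush crush_map_new --test-map-pgs --pool 2
```

This exercises the same mapping logic via a different code path, so it may also help narrow down whether the crash is in crushtool itself or in the map.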
On 6/5/21 at 14:13, Eugen Block wrote:
Interesting, I haven't had that yet with crushtool. Your ceph version
is Nautilus, right? And you did decompile the binary crushmap with
crushtool, correct? I don't know how to reproduce that.
Quoting Andres Rojas Guerrero <a.rojas@xxxxxxx>:
I get this error when I try to show the mappings with crushtool:
# crushtool -i crush_map_new --test --rule 2 --num-rep 7 --show-mappings
CRUSH rule 2 x 0 [-5,-45,-49,-47,-43,-41,-29]
*** Caught signal (Segmentation fault) **
in thread 7f7f7a0ccb40 thread_name:crushtool
On 6/5/21 at 13:47, Eugen Block wrote:
Yes, it is possible, but you should validate it with crushtool before
injecting it to make sure the PGs land where they belong.
crushtool -i crushmap.bin --test --rule 2 --num-rep 7 --show-mappings
crushtool -i crushmap.bin --test --rule 2 --num-rep 7 --show-bad-mappings
If you don't get bad mappings and 'show-mappings' confirms the PG
distribution by host, you can inject it. But be aware that this causes a
lot of data movement, which could explain the (temporarily) unavailable
PGs. To make your cluster resilient against host failures, though,
you'll have to go through that at some point.
https://docs.ceph.com/en/latest/rados/operations/crush-map-edits/
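The injection step itself can be sketched like this (using the filename from the test commands above; injecting immediately starts remapping, so expect backfill traffic):

```shell
# Inject the validated crushmap into the cluster.
ceph osd setcrushmap -i crushmap.bin

# Watch the recovery/backfill progress while PGs remap.
ceph -s
```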
Quoting Andres Rojas Guerrero <a.rojas@xxxxxxx>:
Hi, I'm trying to create a new crush rule (Nautilus) in order to set the
correct failure domain, host:
{
    "rule_id": 2,
    "rule_name": "nxtcloudAFhost",
    "ruleset": 2,
    "type": 3,
    "min_size": 3,
    "max_size": 7,
    "steps": [
        {
            "op": "set_chooseleaf_tries",
            "num": 5
        },
        {
            "op": "set_choose_tries",
            "num": 100
        },
        {
            "op": "take",
            "item": -1,
            "item_name": "default"
        },
        {
            "op": "choose_indep",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}
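For reference, the same rule in the decompiled crushmap text syntax would look roughly like this (a sketch; the rule name and steps are taken from the JSON above, and "type 3" corresponds to an erasure rule):

```
rule nxtcloudAFhost {
        id 2
        type erasure
        min_size 3
        max_size 7
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step choose indep 0 type host
        step emit
}
```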
And I have changed the pool to this new crush rule:
# ceph osd pool set nxtcloudAF crush_rule nxtcloudAFhost
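To confirm which rule the pool is actually using after such a change, it can be queried back (pool name taken from the command above):

```shell
# Show the crush rule currently assigned to the pool.
ceph osd pool get nxtcloudAF crush_rule

# Spot-check PG states of the pool while data moves.
ceph pg ls-by-pool nxtcloudAF | head
```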
But suddenly the CephFS is unavailable:
# ceph status
cluster:
id: c74da5b8-3d1b-483e-8b3a-739134db6cf8
health: HEALTH_WARN
11 clients failing to respond to capability release
2 MDSs report slow metadata IOs
1 MDSs report slow requests
And clients are failing to respond:
HEALTH_WARN 11 clients failing to respond to capability release; 2 MDSs
report slow metadata IOs; 1 MDSs report slow requests
MDS_CLIENT_LATE_RELEASE 11 clients failing to respond to capability release
    mdsceph2mon03(mds.1): Client nxtcl3: failing to respond to
capability release client_id: 1524269
    mdsceph2mon01(mds.0): Client nxtcl5:nxtclproAF failing to respond to
I reverted the change, returning to the original crush rule, and
everything is OK. My question is whether it's possible to change the
crush rule of an EC pool on the fly.
Thanks
On 5/5/21 at 18:14, Andres Rojas Guerrero wrote:
Thanks, I will test it.
On 5/5/21 at 16:37, Joachim Kraftmayer wrote:
Create a new crush rule with the correct failure domain, test it
properly and assign it to the pool(s).
--
*******************************************************
Andrés Rojas Guerrero
Unidad Sistemas Linux
Area Arquitectura Tecnológica
Secretaría General Adjunta de Informática
Consejo Superior de Investigaciones Científicas (CSIC)
Pinar 19
28006 - Madrid
Tel: +34 915680059 -- Ext. 990059
email: a.rojas@xxxxxxx
ID comunicate.csic.es: @50852720l:matrix.csic.es
*******************************************************
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx