RE: [Ceph] Managing crushmap

Hi Greg,

The information you requested:
Dumpling platform: ceph version 0.67.9 (ba340a97c3dafc9155023da8d515eecc675c619a)

Attached you will find the decompiled crushmap
(created today with "ceph osd getcrushmap -o crushmap_current" and "crushtool -d crushmap_current -o crushmap_current.txt")
and the output of:
- "ceph osd crush rule dump", to get the rule_ids
- "ceph osd dump | grep pool", to get the pool characteristics

The ceph commands I ran:
For the crushmap:
	ceph osd getcrushmap -o crushmap_current
	crushtool -d crushmap_current -o crushmap_current.txt
	vi crushmap_current.txt ==> crushmap_V2.4.txt
	crushtool -c crushmap_V2.4.txt -o crushmap_V2.4
	ceph osd setcrushmap -i crushmap_V2.4
	
The file crushmap_current provided today is therefore the decompilation of crushmap_V2.4.
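As a quick cross-check after the setcrushmap (a sketch; the excerpt comes from the attached rule dump — the rule_id follows the rule's position in the compiled map, only the ruleset is set explicitly in the decompiled file):

	ceph osd crush rule dump | grep -E '"rule_id"|"rule_name"|"ruleset"'
	# excerpt for the new rule, from the attached dump:
	#     { "rule_id": 4,
	#       "rule_name": "fastrgw",
	#       "ruleset": 50,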

For the pools:
	rados mkpool .rgw.fastrgw
	ceph osd pool set .rgw.fastrgw crush_ruleset 50 ==> crush ruleset 50 does not exist
	ceph osd pool set .rgw.fastrgw crush_ruleset 4 ==> set pool 47 crush_ruleset to 4

I noticed this morning that the following command works:
	rados mkpool testpool 0 50

It seems that the command "ceph osd pool set <pool> crush_ruleset <ruleset_number>" is not working correctly: it rejects the ruleset (50) but accepts the rule_id (4).
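To double-check what the cluster actually recorded, something like this should work (a sketch; I assume crush_ruleset is a gettable pool field on these versions):

	ceph osd pool get .rgw.fastrgw crush_ruleset
	ceph osd dump | grep fastrgw	# the pool currently shows crush_ruleset 4 (the rule_id that was accepted), not 50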

To continue my tests, I will delete the pools and recreate them with "rados mkpool", specifying the right ruleset directly, as sketched below.
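Concretely, something like this (a sketch only; pool names as in the attached dump, and I assume the usual pool delete confirmation syntax of these releases):

	ceph osd pool delete .rgw.fastrgw .rgw.fastrgw --yes-i-really-really-mean-it
	rados mkpool .rgw.fastrgw 0 50	# auid 0 and ruleset 50, as in the testpool example above
	ceph osd dump | grep fastrgw	# the new pool should show crush_ruleset 50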

For the firefly platform (ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)),
we only checked that the behavior is identical after ingesting a rule whose ruleset is not equal to its rule_id.

Notes:
- The pool and rule names have changed since I sent the original message, but the behavior remains the same.
- Using Calamari to modify pool parameters, the GUI lists the rulesets but selecting one has no effect (no update, no error displayed).
- Using Inkscope to modify pool parameters, we can force the ruleset number but it has no effect (no update, no error displayed).
- We also noticed that ceph-rest-api returns 200 OK even when the update is not applied (a sketch of the call we used is below).
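For reference, this is roughly how ceph-rest-api was exercised; the endpoint path, parameter names and host below are my assumption of how the CLI command maps onto the REST API, not something taken from its documentation:

	# assumed REST mapping of "ceph osd pool set <pool> crush_ruleset <val>"; "cephrestapi" is a placeholder host
	curl -i -X PUT "http://cephrestapi:5000/api/v0.1/osd/pool/set?pool=.rgw.fastrgw&var=crush_ruleset&val=50"
	# the call returns 200 OK although "ceph osd dump | grep fastrgw" still shows the old crush_ruleset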

Best regards

-----Original Message-----
From: Gregory Farnum [mailto:greg@xxxxxxxxxxx]
Sent: Wednesday, June 11, 2014 19:03
To: CHEVALIER Ghislain IMT/OLPS
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: [Ceph] Managing crushmap

That doesn't sound right. Can you supply your decompiled CRUSH map,
the exact commands you ran against the ceph cluster, and the exact
version(s) you ran the test against?
-Greg

Software Engineer #42 @ http://inktank.com | http://ceph.com


On Wed, Jun 11, 2014 at 2:17 AM,  <ghislain.chevalier@xxxxxxxxxx> wrote:
> Hi all,
>
> Context :
> Lab Platform
> Ceph dumpling and firefly
> Ubuntu 12.04 LTS
>
> I encountered a strange behavior managing the crushmap on a dumpling and a firefly ceph platform.
>
> I built a crushmap, adding 2 specific rules (fastrule and slowrule) in order to experiment with tiering.
> I used "ceph osd get|setcrushmap" and crushtool to extract and reingest the updated crushmap into the cluster.
> I should point out that I assigned ruleset numbers 50 and 51, respectively, to the 2 new rules.
> The ingestion went well; I checked it with "ceph osd crush rule dump".
>
> I created 2 pools (fastpool and slowpool)
> As indicated in the doc, I tried to associate fastpool with ruleset 50 using "ceph osd pool set fastpool crush_ruleset 50";
> an error occurred: rule 50 doesn't exist.
> As the rule_id of fastrule is 4, I ran "ceph osd pool set fastpool crush_ruleset 4" and it worked, but this is not correct behavior.
> If a ceph admin wants to manage the crushmap, he should not have to look up the rule_id (which he cannot set) before updating the crush_ruleset attribute of a pool.
> The way to manage rules is via the ruleset, not the rule_id.
>
> I also observed that reingesting a crushmap (after, for example, changing the order of the rules in the decompiled file) causes a global renumbering of the rule_ids.
> I can't imagine the impact on a platform.
>
> Did someone encounter this behavior?
> Did I misunderstand how to configure a crushmap?
>
> Best regards
>
>
>
>
>
> - - - - - - - - - - - - - - - - -
> Ghislain Chevalier
> ORANGE/OLNC/OLPS/ASE/DAPI/CSE
> Architecte de services de stockage
> Storage Service Architect
>  +33299124432
> ghislain.chevalier@xxxxxxxxxx
>  Think of the environment before printing this message!
>
>


Attachment: crushmap_current.txt (decompiled crushmap)

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9

# types
type 0 osd
type 1 host
type 2 platform
type 3 datacenter
type 4 root
type 5 appclient
type 10 diskclass
type 50 appclass

# buckets
host p-sbceph13 {
	id -13		# do not change unnecessarily
	# weight 0.020
	alg straw
	hash 0	# rjenkins1
	item osd.0 weight 0.010
	item osd.5 weight 0.010
}
host p-sbceph14 {
	id -14		# do not change unnecessarily
	# weight 0.020
	alg straw
	hash 0	# rjenkins1
	item osd.1 weight 0.010
	item osd.6 weight 0.010
}
host p-sbceph15 {
	id -15		# do not change unnecessarily
	# weight 0.020
	alg straw
	hash 0	# rjenkins1
	item osd.2 weight 0.010
	item osd.7 weight 0.010
}
host p-sbceph12 {
	id -12		# do not change unnecessarily
	# weight 0.020
	alg straw
	hash 0	# rjenkins1
	item osd.3 weight 0.010
	item osd.8 weight 0.010
}
host p-sbceph11 {
	id -11		# do not change unnecessarily
	# weight 0.020
	alg straw
	hash 0	# rjenkins1
	item osd.4 weight 0.010
	item osd.9 weight 0.010
}
platform sandbox {
	id -3		# do not change unnecessarily
	# weight 0.100
	alg straw
	hash 0	# rjenkins1
	item p-sbceph13 weight 0.020
	item p-sbceph14 weight 0.020
	item p-sbceph15 weight 0.020
	item p-sbceph12 weight 0.020
	item p-sbceph11 weight 0.020
}
datacenter nanterre {
	id -2		# do not change unnecessarily
	# weight 0.100
	alg straw
	hash 0	# rjenkins1
	item sandbox weight 0.100
}
root default {
	id -1		# do not change unnecessarily
	# weight 0.100
	alg straw
	hash 0	# rjenkins1
	item nanterre weight 0.100
}
appclass fastrgw {
	id -501		# do not change unnecessarily
	# weight 0.050
	alg straw
	hash 0	# rjenkins1
	item osd.0 weight 0.010
	item osd.1 weight 0.010
	item osd.2 weight 0.010
	item osd.3 weight 0.010
	item osd.4 weight 0.010
}
appclass slowrgw {
	id -502		# do not change unnecessarily
	# weight 0.050
	alg straw
	hash 0	# rjenkins1
	item osd.5 weight 0.010
	item osd.6 weight 0.010
	item osd.7 weight 0.010
	item osd.8 weight 0.010
	item osd.9 weight 0.010
}
appclient apprgw {
	id -50		# do not change unnecessarily
	# weight 0.100
	alg straw
	hash 0	# rjenkins1
	item fastrgw weight 0.050
	item slowrgw weight 0.050
}
appclass faststd {
	id -511		# do not change unnecessarily
	# weight 0.050
	alg straw
	hash 0	# rjenkins1
	item osd.0 weight 0.010
	item osd.1 weight 0.010
	item osd.2 weight 0.010
	item osd.3 weight 0.010
	item osd.4 weight 0.010
}
appclass slowstd {
	id -512		# do not change unnecessarily
	# weight 0.050
	alg straw
	hash 0	# rjenkins1
	item osd.5 weight 0.010
	item osd.6 weight 0.010
	item osd.7 weight 0.010
	item osd.8 weight 0.010
	item osd.9 weight 0.010
}
appclient appstd {
	id -51		# do not change unnecessarily
	# weight 0.100
	alg straw
	hash 0	# rjenkins1
	item faststd weight 0.050
	item slowstd weight 0.050
}
root approot {
	id -5		# do not change unnecessarily
	# weight 0.200
	alg straw
	hash 0	# rjenkins1
	item apprgw weight 0.100
	item appstd weight 0.100
}
diskclass fastsata {
	id -110		# do not change unnecessarily
	# weight 0.050
	alg straw
	hash 0	# rjenkins1
	item osd.0 weight 0.010
	item osd.1 weight 0.010
	item osd.2 weight 0.010
	item osd.3 weight 0.010
	item osd.4 weight 0.010
}
diskclass slowsata {
	id -120		# do not change unnecessarily
	# weight 0.050
	alg straw
	hash 0	# rjenkins1
	item osd.5 weight 0.010
	item osd.6 weight 0.010
	item osd.7 weight 0.010
	item osd.8 weight 0.010
	item osd.9 weight 0.010
}
root diskroot {
	id -100		# do not change unnecessarily
	# weight 0.100
	alg straw
	hash 0	# rjenkins1
	item fastsata weight 0.050
	item slowsata weight 0.050
}

# rules
rule data {
	ruleset 0
	type replicated
	min_size 1
	max_size 10
	step take slowstd
	step chooseleaf firstn 0 type osd
	step emit
}
rule metadata {
	ruleset 1
	type replicated
	min_size 1
	max_size 10
	step take slowstd
	step chooseleaf firstn 0 type osd
	step emit
}
rule rbd {
	ruleset 2
	type replicated
	min_size 1
	max_size 10
	step take faststd
	step chooseleaf firstn 0 type osd
	step emit
}
rule test {
	ruleset 30
	type replicated
	min_size 1
	max_size 10
	step take fastsata
	step chooseleaf firstn 0 type osd
	step emit
}
rule fastrgw {
	ruleset 50
	type replicated
	min_size 1
	max_size 10
	step take fastrgw
	step chooseleaf firstn 0 type osd
	step emit
}
rule slowrgw {
	ruleset 51
	type replicated
	min_size 1
	max_size 10
	step take slowrgw
	step chooseleaf firstn 0 type osd
	step emit
}

# end crush map

Output of "ceph osd crush rule dump":
[
    { "rule_id": 0,
      "rule_name": "data",
      "ruleset": 0,
      "type": 1,
      "min_size": 1,
      "max_size": 10,
      "steps": [
            { "op": "take",
              "item": -512},
            { "op": "chooseleaf_firstn",
              "num": 0,
              "type": "osd"},
            { "op": "emit"}]},
    { "rule_id": 1,
      "rule_name": "metadata",
      "ruleset": 1,
      "type": 1,
      "min_size": 1,
      "max_size": 10,
      "steps": [
            { "op": "take",
              "item": -512},
            { "op": "chooseleaf_firstn",
              "num": 0,
              "type": "osd"},
            { "op": "emit"}]},
    { "rule_id": 2,
      "rule_name": "rbd",
      "ruleset": 2,
      "type": 1,
      "min_size": 1,
      "max_size": 10,
      "steps": [
            { "op": "take",
              "item": -511},
            { "op": "chooseleaf_firstn",
              "num": 0,
              "type": "osd"},
            { "op": "emit"}]},
    { "rule_id": 3,
      "rule_name": "test",
      "ruleset": 30,
      "type": 1,
      "min_size": 1,
      "max_size": 10,
      "steps": [
            { "op": "take",
              "item": -110},
            { "op": "chooseleaf_firstn",
              "num": 0,
              "type": "osd"},
            { "op": "emit"}]},
    { "rule_id": 4,
      "rule_name": "fastrgw",
      "ruleset": 50,
      "type": 1,
      "min_size": 1,
      "max_size": 10,
      "steps": [
            { "op": "take",
              "item": -501},
            { "op": "chooseleaf_firstn",
              "num": 0,
              "type": "osd"},
            { "op": "emit"}]},
    { "rule_id": 5,
      "rule_name": "slowrgw",
      "ruleset": 51,
      "type": 1,
      "min_size": 1,
      "max_size": 10,
      "steps": [
            { "op": "take",
              "item": -502},
            { "op": "chooseleaf_firstn",
              "num": 0,
              "type": "osd"},
            { "op": "emit"}]}]

Output of "ceph osd dump | grep pool":
pool 0 'data' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 64 last_change 8174 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0
pool 2 'rbd' rep size 3 min_size 1 crush_ruleset 2 object_hash rjenkins pg_num 64 pgp_num 64 last_change 8169 owner 0
pool 3 '.rgw.root' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 429 owner 0
pool 4 '.rgw.control' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 431 owner 0
pool 5 '.rgw' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 433 owner 0
pool 6 '.rgw.gc' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 434 owner 0
pool 7 '.log' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 2930 owner 0
pool 8 '.intent-log' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 866 owner 0
pool 9 '.usage' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 868 owner 0
pool 10 '.users' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 870 owner 0
pool 11 '.users.email' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 872 owner 0
pool 12 '.users.swift' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 874 owner 0
pool 13 '.users.uid' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 876 owner 0
pool 15 '.rgw.buckets.index' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 891 owner 0
pool 16 '.rgw.buckets' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 893 owner 0
pool 30 '.log150000' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 2935 owner 0
pool 44 'yar58' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 200 pgp_num 200 last_change 6852 owner 0
pool 45 'test.rules' rep size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 8171 owner 0
pool 47 '.rgw.fastrgw' rep size 3 min_size 1 crush_ruleset 4 object_hash rjenkins pg_num 167 pgp_num 167 last_change 8210 owner 0
pool 48 '.rgw.slowrgw' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 8149 owner 0
pool 49 '.rgw.fastrgw.index' rep size 3 min_size 1 crush_ruleset 4 object_hash rjenkins pg_num 167 pgp_num 167 last_change 8184 owner 0
pool 50 '.rgw.slowrgw.index' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 8155 owner 0
