This is what my crush tree, including the shadow hierarchies, looks like (a mess :): https://pastebin.com/iCLbi4Up

Every device class has its own tree. Starting with mimic, this is automatic when creating new device classes.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Eugen Block <eblock@xxxxxx>
Sent: 30 September 2020 08:43:47
To: Frank Schilder
Cc: Marc Roos; ceph-users
Subject: Re: Re: hdd pg's migrating when converting ssd class osd's

Interesting, I also did this test on an upgraded cluster (L to N). I'll
repeat the test on a native Nautilus to see it for myself.

Zitat von Frank Schilder <frans@xxxxxx>:

> Somebody on this list posted a script that can convert pre-mimic crush
> trees with buckets for different types of devices to crush trees with
> device classes with minimal data movement (trying to maintain IDs as
> much as possible). I don't have the thread name right now, but I can
> try to find it tomorrow.
>
> I can check tomorrow how our crush tree unfolds. Basically, for every
> device class there is a full copy of the hierarchy (the shadow
> hierarchy) with its own weights etc.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx>
> Sent: 29 September 2020 22:19:33
> To: eblock; Frank Schilder
> Cc: ceph-users
> Subject: RE: Re: hdd pg's migrating when converting ssd class osd's
>
> Yes, correct, this is coming from Luminous or maybe even Kraken. What
> does a default crush tree look like in mimic or octopus? Or is there
> some manual on how to bring this to the new 'default'?
>
>
> -----Original Message-----
> Cc: ceph-users
> Subject: Re: Re: hdd pg's migrating when converting ssd class osd's
>
> Are these crush maps inherited from pre-mimic versions? I have
> re-balanced SSD and HDD pools in mimic (mimic-deployed) where one
> device class never influenced the placement of the other. I have mixed
> hosts and went as far as introducing rbd_meta, rbd_data and similar
> classes to sub-divide even further (all these devices have different
> perf specs). This worked like a charm. When adding devices of one
> class, only pools in this class were ever affected.
>
> As far as I understand, starting with mimic, every shadow class defines
> a separate tree (not just the leaves/OSDs). Thus, device classes are
> independent of each other.
>
>
> ________________________________________
> Sent: 29 September 2020 20:54:48
> To: eblock
> Cc: ceph-users
> Subject: Re: hdd pg's migrating when converting ssd class osd's
>
> Yes, correct, the hosts indeed have both ssd's and hdd's combined. Is
> this not more of a bug then? I would assume the goal of using device
> classes is that you separate these and one does not affect the other;
> the host weights of the ssd and hdd classes are even already available.
> The algorithm should just use those instead of the weight of the whole
> host. Or is there some specific use case where combining these classes
> is required?
>
>
> -----Original Message-----
> Cc: ceph-users
> Subject: *****SPAM***** Re: Re: hdd pg's migrating when converting ssd
> class osd's
>
> They're still in the same root (default) and each host is a member of
> both device classes; I guess you have a mixed setup (hosts c01/c02 have
> both HDDs and SSDs)? I don't think this separation is enough to avoid
> remapping even if a different device class is affected (your report
> confirms that).
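For reference, whether a reweight in one device class really leaves the
other class's shadow buckets untouched can be checked directly from the
shadow tree. A minimal sketch, assuming you repeat the test described
further down (the OSD id and file names here are placeholders, not taken
from the thread):

# dump the crush tree including the per-class shadow buckets
ceph osd crush tree --show-shadow > shadow_before.txt
# reweight one ssd OSD, as in the reported test
ceph osd crush reweight osd.12 0.0
ceph osd crush tree --show-shadow > shadow_after.txt
# only the ~ssd shadow buckets (and the plain, class-less buckets)
# should differ; a change under default~hdd would explain hdd PG movement
diff shadow_before.txt shadow_after.txt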
> Dividing the crush tree into different subtrees might help here, but
> I'm not sure if that's really something you need. You might also just
> deal with the remapping as long as it doesn't happen too often, I
> guess. On the other hand, if your setup won't change (except adding
> more OSDs), you might as well think about a different crush tree. It
> really depends on your actual requirements.
>
> We created two different subtrees when we got new hardware, and it
> helped us a lot: we moved the data to the new hardware only once,
> avoiding multiple remappings. Now the older hardware is our EC
> environment, except for some SSDs on those old hosts that had to stay
> in the main subtree. So our setup is also very individual, but it works
> quite nicely. :-)
>
>
> Zitat von :
>
>> I have practically a default setup. If I do a 'ceph osd crush tree
>> --show-shadow' I get a listing like this [1]. I would assume, from the
>> hosts being listed within default~ssd and default~hdd, that they are
>> separate (enough)?
>>
>>
>> [1]
>> root default~ssd
>>     host c01~ssd
>>     ..
>>     ..
>>     host c02~ssd
>>     ..
>> root default~hdd
>>     host c01~hdd
>>     ..
>>     host c02~hdd
>>     ..
>> root default
>>
>>
>> -----Original Message-----
>> To: ceph-users@xxxxxxx
>> Subject: Re: hdd pg's migrating when converting ssd class osd's
>>
>> Are all the OSDs in the same crush root? I would think that, since the
>> crush weight of hosts changes as soon as OSDs are out, it impacts the
>> whole crush tree. If you separate the SSDs from the HDDs logically
>> (e.g. a different bucket type in the crush tree), the remapping
>> wouldn't affect the HDDs.
>>
>>
>>> I have been converting ssd osd's to dmcrypt, and I have noticed that
>>> pg's of pools are migrated that should be (and are?) on the hdd
>>> class.
>>>
>>> On an otherwise healthy cluster, when I set the crush reweight of an
>>> ssd osd to 0.0, I get this:
>>>
>>> 17.35  10415  0  0  9907  0  36001743890  0  0  3045  3045
>>> active+remapped+backfilling  2020-09-27 12:55:49.093054
>>> 83758'20725398  83758:100379720  [8,14,23]  8  [3,14,23]  3
>>> 83636'20718129  2020-09-27 00:58:07.098096  83300'20689151
>>> 2020-09-24 21:42:07.385360  0
>>>
>>> However, osds 3, 14, 23 and 8 are all hdd osd's.
>>>
>>> Since this is a cluster from Kraken/Luminous, I am not sure if the
>>> device class of the replicated_ruleset [1] was set when pool 17 was
>>> created. The weird thing is that all pg's of this pool seem to be on
>>> hdd osds [2].
>>>
>>> Q. How can I display the definition of 'crush_rule 0' at the time of
>>> the pool creation? (To be sure it already had this device class hdd
>>> configured.)
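One way that may answer this question, assuming the monitors still keep
an osdmap epoch from around the time the pool was created (old maps get
trimmed, so this often only works for recent history; the epoch number
below is made up): fetch the old osdmap, extract its crush map, and
decompile it.

# fetch an old osdmap epoch and pull the crush map out of it
ceph osd getmap 83000 -o osdmap.83000
osdmaptool osdmap.83000 --export-crush crush.83000
crushtool -d crush.83000 -o crush.83000.txt
# inspect the rule as it was in that epoch
grep -A 12 'rule replicated_ruleset' crush.83000.txt

If no sufficiently old epoch is available, the current rule dump and the
per-OSD device-class check (as in [1] and [2] below) are the best
remaining evidence.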
>>> [1]
>>> [@~]# ceph osd pool ls detail | grep 'pool 17'
>>> pool 17 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash
>>> rjenkins pg_num 64 pgp_num 64 autoscale_mode warn last_change 83712
>>> flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
>>>
>>> [@~]# ceph osd crush rule dump replicated_ruleset
>>> {
>>>     "rule_id": 0,
>>>     "rule_name": "replicated_ruleset",
>>>     "ruleset": 0,
>>>     "type": 1,
>>>     "min_size": 1,
>>>     "max_size": 10,
>>>     "steps": [
>>>         {
>>>             "op": "take",
>>>             "item": -10,
>>>             "item_name": "default~hdd"
>>>         },
>>>         {
>>>             "op": "chooseleaf_firstn",
>>>             "num": 0,
>>>             "type": "host"
>>>         },
>>>         {
>>>             "op": "emit"
>>>         }
>>>     ]
>>> }
>>>
>>> [2]
>>> [@~]# for osd in `ceph pg dump pgs | grep '^17' | awk '{print $17" "$19}' \
>>>     | grep -oE '[0-9]{1,2}' | sort -u -n`; do \
>>>     ceph osd crush get-device-class osd.$osd; done | sort -u
>>> dumped pgs
>>> hdd
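On the earlier question of bringing a pre-mimic tree to the new
per-class layout: whether or not it is the script Frank refers to,
crushtool in Nautilus and later has a --reclassify mode that converts
legacy per-device-type buckets into device classes while trying to keep
bucket IDs stable and data movement minimal. A rough sketch following
the upstream documentation; the '%-ssd' bucket-name pattern is only an
assumption about how the legacy SSD buckets are named and would need
adjusting to the actual tree:

# export the current crush map
ceph osd getcrushmap -o original
# mark everything under 'default' as hdd and fold the legacy
# '<host>-ssd' buckets into the ssd class under the same root
crushtool -i original --reclassify \
  --set-subtree-class default hdd \
  --reclassify-root default hdd \
  --reclassify-bucket %-ssd ssd default \
  -o adjusted
# check how many PG mappings would change before injecting the new map
crushtool -i original --compare adjusted
ceph osd setcrushmap -i adjusted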