Re: Best way to change bucket hierarchy

Thanks Frank,

Interesting info about the EC profile.  I do have an EC pool, but I noticed the following when I dumped the profile:

# ceph osd erasure-code-profile get ec22
crush-device-class=hdd
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=2
m=2
plugin=jerasure
technique=reed_sol_van
w=8
#

Which says that the failure domain of the EC profile is also set to host.  It looks like I need to change the EC profile too, but since it is associated with the pool, maybe I can’t do that after pool creation?  Or, since the property is named “crush-failure-domain”, is it automatically inherited from the crush profile, so that I don’t have to do anything?
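
One way to see which failure domain is actually in effect is to look at the crush rule the pool uses rather than at the profile; Frank's quoted reply below notes that the profile can drift out of sync with the crush map without harm. The following is a minimal read-only sketch, not a definitive procedure. The pool name "ecpool" and rule name "ecpool_rule" are hypothetical placeholders, and the run wrapper is an illustrative helper (not part of Ceph) that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to run them against a real cluster.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Show which EC profile and which crush rule a pool refers to.
# $1 = pool name (hypothetical placeholder in the example call below).
show_pool_placement() {
    run ceph osd pool get "$1" erasure_code_profile &&
    run ceph osd pool get "$1" crush_rule
}

# Dump one crush rule; its choose/chooseleaf step names the bucket type in use.
# $1 = rule name, as reported by show_pool_placement.
show_rule() {
    run ceph osd crush rule dump "$1"
}

show_pool_placement ecpool
show_rule ecpool_rule
```

If the rule dump still shows type host in its chooseleaf step, the pool is still placing chunks per host, whatever the profile says.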

Thanks,

George


On Jun 4, 2020, at 1:51 AM, Frank Schilder <frans@xxxxxx> wrote:

Hi George,

For replicated rules you can simply create a new crush rule with the failure domain set to chassis and switch each pool's crush rule to the new one. If you have EC pools, the chooseleaf step of the EC rule needs to be edited by hand. I have done this before as well. (An unfortunate side effect is that the EC profile attached to the pool goes out of sync with the crush map, and there is nothing one can do about that. This is annoying yet harmless.)
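
The two paths described above could look roughly like the sketch below. It is an illustration under assumptions, not a tested recipe: the pool name, rule name, and file names (crush.bin, crush.txt, crush.new.bin) are hypothetical, and the root "default" is taken from the profile dump earlier in this thread. The run wrapper is an illustrative helper that only prints commands unless DRY_RUN=0 is set. Note that EC rules generated from a profile typically use "indep" rather than "firstn" in the chooseleaf step, so only the bucket type at the end of that step should need to change.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to run them against a real cluster.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Replicated pools: create a rule with chassis as failure domain,
# then point the pool at it. $1 = pool name, $2 = new rule name.
switch_replicated_pool() {
    run ceph osd crush rule create-replicated "$2" default chassis &&
    run ceph osd pool set "$1" crush_rule "$2"
}

# EC pools, step 1: fetch the crush map and decompile it to editable text.
decompile_crush_map() {
    run ceph osd getcrushmap -o crush.bin &&
    run crushtool -d crush.bin -o crush.txt
}

# EC pools, step 2: run only AFTER hand-editing crush.txt so that the
# EC rule's chooseleaf step ends in "type chassis" instead of "type host".
inject_crush_map() {
    run crushtool -c crush.txt -o crush.new.bin &&
    run ceph osd setcrushmap -i crush.new.bin
}

# One step per invocation, so the hand edit happens between decompile and inject.
case "${1:-}" in
    replicated) switch_replicated_pool "$2" "$3" ;;
    decompile)  decompile_crush_map ;;
    inject)     inject_crush_map ;;
    *)          echo "usage: $0 replicated POOL RULE | decompile | inject" ;;
esac
```

Splitting decompile and inject into separate invocations keeps the manual edit as an explicit step in between.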

The intent of making these changes while norebalance is set is

- to avoid unnecessary data movement due to successive changes happening step by step and
- to make sure peering is successful before starting to move data.

I believe OSDs peer a bit faster with norebalance set, so the interruption to ongoing I/O is shorter (no I/O happens to a PG during peering).

Yes, if you save the old crush map, you can undo everything. It is a good idea to keep a backup anyway, for reference and to compare before and after.
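
A minimal sketch of such a backup, restore, and comparison, assuming hypothetical file names of the form crush-LABEL.bin and crush-LABEL.txt. The run wrapper is an illustrative helper that only prints commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to run them against a real cluster.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Save the binary crush map plus a decompiled text copy. $1 = label, e.g. "before".
backup_crush_map() {
    run ceph osd getcrushmap -o "crush-$1.bin" &&
    run crushtool -d "crush-$1.bin" -o "crush-$1.txt"
}

# Re-inject a previously saved binary map. $1 = label used at backup time.
restore_crush_map() {
    run ceph osd setcrushmap -i "crush-$1.bin"
}

# Compare two decompiled copies. $1, $2 = labels, e.g. "before" "after".
compare_crush_maps() {
    run diff -u "crush-$1.txt" "crush-$2.txt"
}

case "${1:-}" in
    backup)  backup_crush_map "$2" ;;
    restore) restore_crush_map "$2" ;;
    compare) compare_crush_maps "$2" "$3" ;;
    *)       echo "usage: $0 backup LABEL | restore LABEL | compare LABEL1 LABEL2" ;;
esac
```

Taking one backup before the change and one after gives two text files that can be diffed for reference.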

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Kyriazis, George <george.kyriazis@xxxxxxxxx>
Sent: 04 June 2020 00:58:20
To: Frank Schilder
Cc: ceph-users
Subject: Re: Best way to change bucket hierarchy

Thanks Frank,

I don’t have too much experience editing crush rules, but I assume the chooseleaf step would also have to change to:

       step chooseleaf firstn 0 type chassis

Correct?  Is that the only other change that is needed?  It looks like the rule change can happen either with or without “norebalance” set (again with CLI commands), but is it safer to do it with the flag set (i.e. while not rebalancing)?

If I keep a backup of the crush rule map (with “ceph osd getcrushmap”), I assume I can restore the old map if something goes bad?

Thanks again!

George



On Jun 3, 2020, at 5:24 PM, Frank Schilder <frans@xxxxxx> wrote:

You can use the command-line without editing the crush map. Look at the documentation of commands like

ceph osd crush add-bucket ...
ceph osd crush move ...

Before starting, run "ceph osd set norebalance". Once you are happy with the crush tree, let everything peer; you should see misplaced objects and remapped PGs, but no degraded objects or PGs. Then unset the flag.

Do this only when the cluster is HEALTH_OK; otherwise things can get really complicated.
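
Put together, the sequence above might be sketched as follows. The host name "node1" and chassis name "chassis1" in the usage text are hypothetical, and "root=default" assumes the default root seen in the profile dump elsewhere in this thread. The run wrapper is an illustrative helper that only prints commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to run them against a real cluster.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Check health (should report HEALTH_OK), then pause rebalancing.
begin_changes() {
    run ceph health &&
    run ceph osd set norebalance
}

# Create a chassis bucket under the root and move one host into it.
# $1 = host bucket name, $2 = new chassis bucket name.
place_host_in_chassis() {
    run ceph osd crush add-bucket "$2" chassis &&
    run ceph osd crush move "$2" root=default &&
    run ceph osd crush move "$1" "chassis=$2"
}

# Inspect the result: expect misplaced/remapped, but nothing degraded.
review_changes() {
    run ceph osd crush tree &&
    run ceph status
}

# Allow data movement once the tree looks right and peering has finished.
finish_changes() {
    run ceph osd unset norebalance
}

case "${1:-}" in
    begin)  begin_changes ;;
    place)  place_host_in_chassis "$2" "$3" ;;
    review) review_changes ;;
    finish) finish_changes ;;
    *)      echo "usage: $0 begin | place HOST CHASSIS | review | finish   (e.g. place node1 chassis1)" ;;
esac
```

Keeping review and finish as separate steps leaves room to inspect the tree and wait for peering before the flag is unset.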

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Kyriazis, George <george.kyriazis@xxxxxxxxx>
Sent: 03 June 2020 22:45:11
To: ceph-users
Subject:  Best way to change bucket hierarchy

Hello,

I have a live ceph cluster, and I need to modify the bucket hierarchy.  I am currently using the default crush rule (i.e. keep each replica on a different host).  I need to add a “chassis” level and keep each replica in a different chassis.

From what I read in the documentation, I would have to edit the crush map manually, but this sounds kinda scary for a live cluster.

Are there any “best known methods” to achieve that goal without messing things up?

In my current scenario, I have one host per chassis, and I plan to add nodes later so that there is more than one host per chassis. It looks like “in theory” there wouldn’t be a need for any data movement after the crush map changes.  Will reality match theory?  Anything else I need to watch out for?

Thank you!

George

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

