Hi Bryan,

It shouldn't be a problem for ceph-ansible to expand a cluster even if it wasn't deployed with it. I believe this requires a bit of tweaking on the ceph-ansible side, but not much. Can you elaborate on what went wrong and perhaps on how you configured ceph-ansible?

As far as I understand, you haven't been able to grow your cluster by adding new disks/nodes. Is that correct?

One more thing: why don't you use ceph-ansible entirely for the provisioning and lifecycle management of your cluster? :)

> On 18 Jun 2015, at 00:14, Stillwell, Bryan <bryan.stillwell@xxxxxxxxxxx> wrote:
>
> I've been working on automating a lot of our ceph admin tasks lately and am
> pretty pleased with how the puppet-ceph module has worked for installing
> packages, managing ceph.conf, and creating the mon nodes. However, I don't
> like the idea of puppet managing the OSDs. Since we also use ansible in my
> group, I took a look at ceph-ansible to see how it might be used to complete
> this task. I see examples for doing a rolling update and for doing an OS
> migration, but nothing for adding a node or multiple nodes at once. I don't
> have a problem doing this work, but wanted to check with the community to
> see if anyone has experience using ceph-ansible for this.
>
> After a lot of trial and error I found that the following process works well
> when using ceph-deploy, but it's a lot of steps and can be error prone
> (especially if you have old cephx keys that haven't been removed yet):
>
> # Disable backfilling and scrubbing to prevent too many performance-impacting
> # tasks from happening at the same time. Adding norecover to this list might
> # also be a good idea so that only peering happens at first.
> ceph osd set nobackfill
> ceph osd set noscrub
> ceph osd set nodeep-scrub
>
> # Zap the disks to start from a clean slate
> ceph-deploy disk zap dnvrco01-cephosd-025:sd{b..y}
>
> # Prepare the disks. I found that sleeping between adding each disk can help
> # prevent performance problems.
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdh:/dev/sdb; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdi:/dev/sdb; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdj:/dev/sdb; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdk:/dev/sdc; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdl:/dev/sdc; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdm:/dev/sdc; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdn:/dev/sdd; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdo:/dev/sdd; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdp:/dev/sdd; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdq:/dev/sde; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdr:/dev/sde; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sds:/dev/sde; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdt:/dev/sdf; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdu:/dev/sdf; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdv:/dev/sdf; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdw:/dev/sdg; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdx:/dev/sdg; sleep 15
> ceph-deploy osd prepare dnvrco01-cephosd-025:sdy:/dev/sdg; sleep 15
>
> # Weight in the new OSDs. We set 'osd_crush_initial_weight = 0' to prevent
> # them from being added in during the prepare step. Maybe a longer wait
> # in the last step would make this step unnecessary.
> ceph osd crush reweight osd.450 1.09; sleep 60
> ceph osd crush reweight osd.451 1.09; sleep 60
> ceph osd crush reweight osd.452 1.09; sleep 60
> ceph osd crush reweight osd.453 1.09; sleep 60
> ceph osd crush reweight osd.454 1.09; sleep 60
> ceph osd crush reweight osd.455 1.09; sleep 60
> ceph osd crush reweight osd.456 1.09; sleep 60
> ceph osd crush reweight osd.457 1.09; sleep 60
> ceph osd crush reweight osd.458 1.09; sleep 60
> ceph osd crush reweight osd.459 1.09; sleep 60
> ceph osd crush reweight osd.460 1.09; sleep 60
> ceph osd crush reweight osd.461 1.09; sleep 60
> ceph osd crush reweight osd.462 1.09; sleep 60
> ceph osd crush reweight osd.463 1.09; sleep 60
> ceph osd crush reweight osd.464 1.09; sleep 60
> ceph osd crush reweight osd.465 1.09; sleep 60
> ceph osd crush reweight osd.466 1.09; sleep 60
> ceph osd crush reweight osd.467 1.09; sleep 60
>
> # Once all the OSDs are added to the cluster, allow the backfill process to
> # begin.
> ceph osd unset nobackfill
>
> # Then, once the cluster is healthy again, re-enable scrubbing.
> ceph osd unset noscrub
> ceph osd unset nodeep-scrub
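For what it's worth, the repetitive prepare and reweight steps above could be collapsed into a couple of shell loops. This is only a rough sketch based on the values in your message (same host name, data disks sdh..sdy mapped three per journal device onto sdb..sdg, and new OSD ids 450..467); adjust it to your actual layout:

#!/bin/bash
# Rough sketch only: assumes 18 data disks (sdh..sdy) sharing journal
# devices sdb..sdg three-to-one, on the host used in the message above,
# and that the new OSDs come up as osd.450 through osd.467.
host=dnvrco01-cephosd-025
data_disks=(sd{h..y})
journals=(sd{b..g})

# Prepare each data disk, pausing between disks to limit the impact.
for i in "${!data_disks[@]}"; do
    journal=${journals[$((i / 3))]}
    ceph-deploy osd prepare "${host}:${data_disks[$i]}:/dev/${journal}"
    sleep 15
done

# Bring the new OSDs up to their full CRUSH weight one at a time.
for id in $(seq 450 467); do
    ceph osd crush reweight "osd.${id}" 1.09
    sleep 60
done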
Cheers.
––––
Sébastien Han
Senior Cloud Architect

"Always give 100%. Unless you're giving blood."

Mail: seb@xxxxxxxxxx
Address: 11 bis, rue Roquépine - 75008 Paris

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com