Sébastien,

Nothing has gone wrong with using it in this way; it just has to do with my lack
of experience with ansible/ceph-ansible.  I'm learning both now, but would love
it if there were more documentation around using them.  For example, the
documentation around using ceph-deploy is pretty good, and I was hoping for
something equivalent for ceph-ansible:

http://ceph.com/docs/master/rados/deployment/

With that said, I'm wondering what tweaks you think would be needed to get
ceph-ansible working on an existing cluster?

Also, to answer your other questions: I haven't tried expanding the cluster with
ceph-ansible yet.  I'm playing around with it in vagrant/virtualbox, and it
looks pretty awesome so far!  If everything goes well, I'm not against
revisiting the choice of puppet-ceph and replacing it with ceph-ansible.

One other question: how well does ceph-ansible handle replacing a failed HDD
(/dev/sdo) that has its journal at the beginning or middle of an SSD
(/dev/sdd2)?

Thanks,
Bryan
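
P.S. To make that last question more concrete, the manual replacement procedure
I'd want ceph-ansible to cover looks roughly like the steps below.  This is an
untested sketch; osd.450 stands in for whichever OSD ID was on the failed disk,
and the host/device names just follow our layout:

# Take the failed OSD out and stop its daemon (the stop runs on the OSD host)
ceph osd out 450
sudo service ceph stop osd.450    # or 'stop ceph-osd id=450' on upstart

# Remove it from the CRUSH map, the auth database, and the OSD map
ceph osd crush remove osd.450
ceph auth del osd.450
ceph osd rm 450

# After swapping the physical drive, zap it and prepare it against the
# existing journal partition on the SSD
ceph-deploy disk zap dnvrco01-cephosd-025:sdo
ceph-deploy osd prepare dnvrco01-cephosd-025:sdo:/dev/sdd2

# With osd_crush_initial_weight = 0, weight the replacement back in
# (the new OSD normally gets the freed ID back)
ceph osd crush reweight osd.450 1.09
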
On 6/22/15, 7:09 AM, "Sebastien Han" <seb@xxxxxxxxxx> wrote:

>Hi Bryan,
>
>It shouldn't be a problem for ceph-ansible to expand a cluster even if it
>wasn't deployed with it.
>I believe this requires a bit of tweaking on the ceph-ansible side, but it's
>not much.
>Can you elaborate on what went wrong and perhaps how you configured
>ceph-ansible?
>
>As far as I understood, you haven't been able to grow the size of your
>cluster by adding new disks/nodes?
>Is this statement correct?
>
>One more thing, why don't you use ceph-ansible entirely to do the
>provisioning and life cycle management of your cluster? :)
>
>> On 18 Jun 2015, at 00:14, Stillwell, Bryan
>> <bryan.stillwell@xxxxxxxxxxx> wrote:
>>
>> I've been working on automating a lot of our ceph admin tasks lately and
>> am pretty pleased with how the puppet-ceph module has worked for
>> installing packages, managing ceph.conf, and creating the mon nodes.
>> However, I don't like the idea of puppet managing the OSDs.  Since we
>> also use ansible in my group, I took a look at ceph-ansible to see how it
>> might be used to complete this task.  I see examples for doing a rolling
>> update and for doing an OS migration, but nothing for adding a node or
>> multiple nodes at once.  I don't have a problem doing this work, but
>> wanted to check with the community if anyone has experience using
>> ceph-ansible for this.
>>
>> After a lot of trial and error I found the following process works well
>> when using ceph-deploy, but it's a lot of steps and can be error prone
>> (especially if you have old cephx keys that haven't been removed yet):
>>
>> # Disable backfilling and scrubbing to prevent too many performance-
>> # impacting tasks from happening at the same time.  Maybe adding
>> # norecover to this list would be a good idea so only peering happens
>> # at first.
>> ceph osd set nobackfill
>> ceph osd set noscrub
>> ceph osd set nodeep-scrub
>>
>> # Zap the disks to start from a clean slate
>> ceph-deploy disk zap dnvrco01-cephosd-025:sd{b..y}
>>
>> # Prepare the disks.  I found sleeping between adding each disk can help
>> # prevent performance problems.
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdh:/dev/sdb; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdi:/dev/sdb; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdj:/dev/sdb; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdk:/dev/sdc; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdl:/dev/sdc; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdm:/dev/sdc; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdn:/dev/sdd; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdo:/dev/sdd; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdp:/dev/sdd; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdq:/dev/sde; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdr:/dev/sde; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sds:/dev/sde; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdt:/dev/sdf; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdu:/dev/sdf; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdv:/dev/sdf; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdw:/dev/sdg; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdx:/dev/sdg; sleep 15
>> ceph-deploy osd prepare dnvrco01-cephosd-025:sdy:/dev/sdg; sleep 15
>>
>> # Weight in the new OSDs.  We set 'osd_crush_initial_weight = 0' to
>> # prevent them from being added in during the prepare step.  Maybe a
>> # longer weight in the last step would make this step unnecessary.
>> ceph osd crush reweight osd.450 1.09; sleep 60
>> ceph osd crush reweight osd.451 1.09; sleep 60
>> ceph osd crush reweight osd.452 1.09; sleep 60
>> ceph osd crush reweight osd.453 1.09; sleep 60
>> ceph osd crush reweight osd.454 1.09; sleep 60
>> ceph osd crush reweight osd.455 1.09; sleep 60
>> ceph osd crush reweight osd.456 1.09; sleep 60
>> ceph osd crush reweight osd.457 1.09; sleep 60
>> ceph osd crush reweight osd.458 1.09; sleep 60
>> ceph osd crush reweight osd.459 1.09; sleep 60
>> ceph osd crush reweight osd.460 1.09; sleep 60
>> ceph osd crush reweight osd.461 1.09; sleep 60
>> ceph osd crush reweight osd.462 1.09; sleep 60
>> ceph osd crush reweight osd.463 1.09; sleep 60
>> ceph osd crush reweight osd.464 1.09; sleep 60
>> ceph osd crush reweight osd.465 1.09; sleep 60
>> ceph osd crush reweight osd.466 1.09; sleep 60
>> ceph osd crush reweight osd.467 1.09; sleep 60
>>
>> # Once all the OSDs are added to the cluster, allow the backfill process
>> # to begin.
>> ceph osd unset nobackfill
>>
>> # Then once the cluster is healthy again, re-enable scrubbing
>> ceph osd unset noscrub
>> ceph osd unset nodeep-scrub
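
[A note on my own steps above: the repeated prepare/reweight commands could
probably be collapsed into a couple of loops.  This is an untested sketch that
assumes the same host, the same three-data-disks-per-journal-SSD pairing, and
OSD IDs 450-467:]

host=dnvrco01-cephosd-025
disks=(sd{h..y})       # data disks
journals=(sd{b..g})    # journal SSDs, one per three data disks

# Prepare each data disk against its journal SSD, pausing between disks
for i in "${!disks[@]}"; do
    ceph-deploy osd prepare ${host}:${disks[$i]}:/dev/${journals[$((i / 3))]}
    sleep 15
done

# Bring the new OSDs up to their full CRUSH weight one at a time
for id in {450..467}; do
    ceph osd crush reweight osd.${id} 1.09
    sleep 60
done
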
>
>Cheers.
>
>Sébastien Han
>Senior Cloud Architect
>
>"Always give 100%. Unless you're giving blood."
>
>Mail: seb@xxxxxxxxxx
>Address: 11 bis, rue Roquépine - 75008 Paris

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com