Re: Managing larger ceph clusters

I'm running a small cluster, but I'll chime in since nobody else has.

CERN gave a presentation a while ago (Dumpling time frame) about their deployment.  It covers some of your questions: http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern

My philosophy on Config Management is that it should save me time.  If it's going to take me longer to write a recipe to do something, I'll just do it by hand. Since my cluster is small, there are many things I can do faster by hand.  This may or may not work for you, depending on your documentation / repeatability requirements.  For things that need to be documented, I'll usually write the recipe anyway (I accept Chef recipes as documentation).


For my clusters, I'm using Chef to set up all nodes and manage ceph.conf.  I manually manage my pools, CRUSH map, RadosGW users, and disk replacement.  I was using Chef to add new disks, but I ran into load problems due to my small cluster size.  I'm currently adding disks manually, to manage cluster load better.  As my cluster gets larger, that'll be less important.

I'm also doing upgrades manually, because it's less work than writing the Chef recipe to do a cluster upgrade.  Since Chef isn't cluster aware, it would be a pain to make the recipe cluster aware enough to handle the upgrade.  And I figure if I stall long enough, somebody else will write it :-)  Ansible, with its cluster-wide coordination, looks like it would handle that a bit better.
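
To show what "cluster aware" would mean in practice, here is a minimal Python sketch of a rolling upgrade loop: upgrade one node, then wait for the cluster to report HEALTH_OK before touching the next. This is not a Chef or Ansible recipe, and everything here is hypothetical: `upgrade_node` and `get_health` are callables you would supply yourself (for example, `get_health` could return the output of `ceph health`), and the polling limits are arbitrary.

```python
import time


def rolling_upgrade(nodes, upgrade_node, get_health,
                    poll_seconds=10, max_polls=360, sleep=time.sleep):
    """Upgrade nodes one at a time, waiting for HEALTH_OK between nodes.

    upgrade_node(node) and get_health() are caller-supplied (hypothetical)
    callables; get_health() should return a string such as "HEALTH_OK".
    """
    upgraded = []
    for node in nodes:
        upgrade_node(node)
        # Poll until the cluster is healthy again, or give up.
        for _ in range(max_polls):
            if get_health().startswith("HEALTH_OK"):
                break
            sleep(poll_seconds)
        else:
            raise RuntimeError(
                f"cluster did not return to HEALTH_OK after upgrading {node}"
            )
        upgraded.append(node)
    return upgraded
```

Stopping at the first node that leaves the cluster unhealthy is the point of the exercise: a failed upgrade never spreads past one node.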



On Wed, Apr 15, 2015 at 2:05 PM, Stillwell, Bryan <bryan.stillwell@xxxxxxxxxxx> wrote:
I'm curious what people managing larger ceph clusters are doing with
configuration management and orchestration to simplify their lives?

We've been using ceph-deploy to manage our ceph clusters so far, but
feel that moving the management of our clusters to standard tools would
provide a little more consistency and help prevent some mistakes that
have happened while using ceph-deploy.

We're looking at using the same tools we use in our OpenStack
environment (puppet/ansible), but I'm interested in hearing from people
using chef/salt/juju as well.

Some of the cluster operation tasks that I can think of along with
ideas/concerns I have are:

Keyring management
  Seems like hiera-eyaml is a natural fit for storing the keyrings.

ceph.conf
  I believe the puppet ceph module can be used to manage this file, but
  I'm wondering if using a template (erb?) might be a better method of
  keeping it organized and properly documented.
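
To make the template idea concrete, here is a minimal sketch in Python rather than erb, with `string.Template` standing in for the template engine. The function name, the choice of options, and the example values are assumptions for illustration only; the appeal of a template is that the comments live next to the settings they describe.

```python
from string import Template

# Comments in the template end up in the rendered file, which keeps the
# documentation next to the settings it explains.
CEPH_CONF_TEMPLATE = Template("""\
[global]
# Cluster identity; every node must agree on this value.
fsid = $fsid
# Comma-separated monitor addresses.
mon_host = $mon_host
# Default replica count for newly created pools.
osd_pool_default_size = $pool_size
""")


def render_ceph_conf(fsid, mon_hosts, pool_size):
    """Render a small ceph.conf from per-cluster values (hypothetical helper)."""
    return CEPH_CONF_TEMPLATE.substitute(
        fsid=fsid,
        mon_host=",".join(mon_hosts),
        pool_size=pool_size,
    )


if __name__ == "__main__":
    # Placeholder values, not a real cluster.
    print(render_ceph_conf(
        "00000000-0000-0000-0000-000000000000",
        ["192.0.2.11", "192.0.2.12", "192.0.2.13"],
        3,
    ))
```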

Pool configuration
  The puppet module seems to be able to handle managing replicas and the
  number of placement groups, but I don't see support for erasure coded
  pools yet.  This is probably something we would want the initial
  configuration to be set up by puppet, but not something we would want
  puppet changing on a production cluster.

CRUSH maps
  Describing the infrastructure in yaml makes sense.  Things like which
  servers are in which rows/racks/chassis.  Also describing the type of
  server (model, number of HDDs, number of SSDs) makes sense.
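
As a rough sketch of how such a description could drive the CRUSH hierarchy, the Python below takes a rack-to-hosts mapping (the kind of structure you would load from yaml) and produces the matching `ceph osd crush add-bucket` and `ceph osd crush move` commands. The function name and the layout format are made up for illustration, and it assumes the host buckets already exist in the CRUSH map. It only builds the commands; it does not run them.

```python
def crush_commands(layout, root="default"):
    """Build ceph CLI commands that place hosts into racks.

    layout is a hypothetical mapping of {rack_name: [host_name, ...]}.
    Host buckets are assumed to exist already.
    """
    commands = []
    for rack, hosts in sorted(layout.items()):
        # Create the rack bucket and attach it under the root.
        commands.append(["ceph", "osd", "crush", "add-bucket", rack, "rack"])
        commands.append(["ceph", "osd", "crush", "move", rack, f"root={root}"])
        # Move each host bucket under its rack.
        for host in hosts:
            commands.append(["ceph", "osd", "crush", "move", host, f"rack={rack}"])
    return commands


if __name__ == "__main__":
    # Placeholder rack and host names.
    for command in crush_commands({"rack1": ["node-a", "node-b"]}):
        print(" ".join(command))
```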

CRUSH rules
  I could see puppet managing the various rules based on the backend
  storage (HDD, SSD, primary affinity, erasure coding, etc).

Replacing a failed HDD
  Do you automatically identify the new drive and start using it right
  away?  I've seen people talk about using a combination of udev and
  special GPT partition IDs to automate this.  If you have a cluster
  with thousands of drives I think automating the replacement makes
  sense.  How do you handle the journal partition on the SSD?  Does
  removing the old journal partition and creating a new one create a
  hole in the partition map (because the old partition is removed and
  the new one is created at the end of the drive)?

Replacing a failed SSD journal
  Has anyone automated recreating the journal drive using Sebastien
  Han's instructions, or do you have to rebuild all the OSDs as well?


http://www.sebastien-han.fr/blog/2014/11/27/ceph-recover-osds-after-ssd-journal-failure/


Adding new OSD servers
  How are you adding multiple new OSD servers to the cluster?  I could
  see an ansible playbook being useful here: one that sets the
  nobackfill, noscrub, and nodeep-scrub flags and then adds all the
  OSDs to the cluster.
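
A rough Python sketch of that ordering, for illustration: pause backfill and scrubbing with `ceph osd set`, run whatever commands add the OSDs, then clear the flags with `ceph osd unset` so data movement starts once. The function name is made up, the OSD-adding commands are left to the caller, and clearing the flags afterwards is my assumption about how the playbook would finish. The function only builds the command list.

```python
FLAGS = ("nobackfill", "noscrub", "nodeep-scrub")


def expansion_plan(add_osd_commands, flags=FLAGS):
    """Wrap caller-supplied OSD-adding commands in set/unset flag commands."""
    plan = [["ceph", "osd", "set", flag] for flag in flags]
    plan.extend(add_osd_commands)
    # Clear the flags in reverse order once every OSD has been added.
    plan.extend(["ceph", "osd", "unset", flag] for flag in reversed(flags))
    return plan


if __name__ == "__main__":
    # The echo command is a placeholder for the real OSD-adding steps.
    for command in expansion_plan([["echo", "add-osds-here"]]):
        print(" ".join(command))
```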

Upgrading releases
  I've found an ansible playbook for doing a rolling upgrade which looks
  like it would work well, but are there other methods people are using?


http://www.sebastien-han.fr/blog/2015/03/30/ceph-rolling-upgrades-with-ansible/


Decommissioning hardware
  Seems like another ansible playbook for reducing the OSD weights to
  zero, marking the OSDs out, stopping the service, removing the OSD ID,
  removing the CRUSH entry, unmounting the drives, and finally removing
  the server would be the best method here.  Any other ideas on how to
  approach this?
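
The cluster-side part of that sequence could be sketched in Python as two command lists: one to drain the OSDs and one to remove them afterwards. The function names are made up for illustration. Stopping the daemons and unmounting the drives happen on the host between the two phases and depend on the init system, so they are left out, as is waiting for rebalancing to finish. The `ceph auth del` step is not in the list above; it comes from the usual manual OSD removal procedure.

```python
def drain_commands(osd_ids):
    """Phase 1: move data off the OSDs (CRUSH weight to zero, then mark out)."""
    commands = []
    for osd_id in osd_ids:
        commands.append(["ceph", "osd", "crush", "reweight", f"osd.{osd_id}", "0"])
        commands.append(["ceph", "osd", "out", str(osd_id)])
    return commands


def removal_commands(osd_ids):
    """Phase 2: run only after rebalancing is done and the daemons are stopped."""
    commands = []
    for osd_id in osd_ids:
        commands.append(["ceph", "osd", "crush", "remove", f"osd.{osd_id}"])
        commands.append(["ceph", "auth", "del", f"osd.{osd_id}"])
        commands.append(["ceph", "osd", "rm", str(osd_id)])
    return commands


if __name__ == "__main__":
    # OSD ids 3 and 4 are placeholders.
    for command in drain_commands([3, 4]) + removal_commands([3, 4]):
        print(" ".join(command))
```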


That's all I can think of right now.  Are there any other tasks that
people have run into that are missing from this list?

Thanks,
Bryan


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
