On Tue, Jan 4, 2011 at 10:02 AM, Roland Rabben <roland@xxxxxxxx> wrote:
> Hi
> I have been following your project for a long time and it looks like
> Ceph is getting closer to release 1.0. Are you planning on calling
> version 1.0 "production ready"?

Version 1.0 will definitely be a production-ready version. That's a nomenclature decision we/Sage made a long time ago. However, the possibility exists that we'll push back the release by some trivial-to-significant amount of time.

Lest I dissuade you too much, Ceph is much closer to production readiness than it has been in the past. We're internally working on products based on Ceph, or on pieces of it (rbd), and most of our development time is now devoted to new extra-POSIX features and to bug fixes in new features rather than in old pieces of the code. You can take a look at the tracker's roadmap for a better idea of what still needs to get done; it's mostly disaster-recovery work (e.g. fsck) and other administrative tools or performance enhancements:
http://tracker.newdream.net/projects/ceph/roadmap

> We have been holding off on testing Ceph in depth, but it looks like
> we should start now that a stable production ready release is in
> sight. For this I have a few questions that I am hoping the community
> can answer before we start testing. :)
> Once I have a Ceph distributed file system up and running, what is the
> procedure to scale / increase total storage capacity? Any downtime
> necessary for this?

There's a wiki page about this which I believe covers it well:
http://ceph.newdream.net/wiki/OSD_cluster_expansion/contraction
There is no downtime.

> Do I need to move any data around or "rebalance" data when I add new
> storage nodes? (This is a huge problem with e.g. GlusterFS)

The system does need to rebalance, but it does so automatically -- no manual intervention is required once the OSDs have been added. We're still optimizing this portion; in general you should find that performance dips while the rebalance is in progress, but it doesn't take too long and the system remains fully operational throughout. Since Ceph employs consistent hashing, it moves only a bounded portion of the data around rather than reshuffling all of it onto different storage nodes (see the toy sketch at the end of this mail).

> What are the expected and common maintenance tasks that are Ceph-specific?

Hmm, I can't come up with any. There are a few parameters that you might need to adjust as you scale up the number of machines you're running, but in a steady-state system Ceph is pretty self-managing, and it's becoming more so as we put in logic to auto-tune parameters.

-Greg
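
P.S. A toy illustration of the bounded-rebalance point above, sketched in Python. This is not Ceph code and not the actual CRUSH/placement algorithm -- just a generic consistent-hash ring with made-up object and OSD names -- but it shows why adding a node remaps only a fraction of the data instead of reshuffling everything:

    # Toy consistent-hash ring (NOT Ceph's placement algorithm).
    # Illustrates that adding one node remaps roughly 1/(N+1) of the objects.
    import bisect
    import hashlib

    def h(s):
        """Hash a string to a point on a ring of size 2^32."""
        return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

    class Ring:
        def __init__(self, nodes, vnodes=100):
            # Each node gets several virtual points for smoother balance.
            self.points = sorted((h(f"{n}:{i}"), n)
                                 for n in nodes for i in range(vnodes))
            self.keys = [p for p, _ in self.points]

        def locate(self, key):
            # A key belongs to the first node clockwise from its hash.
            i = bisect.bisect(self.keys, h(key)) % len(self.keys)
            return self.points[i][1]

    objects = [f"obj-{i}" for i in range(10000)]
    before = Ring(["osd0", "osd1", "osd2", "osd3"])
    after = Ring(["osd0", "osd1", "osd2", "osd3", "osd4"])  # add one OSD

    moved = sum(before.locate(o) != after.locate(o) for o in objects)
    print(f"{moved}/{len(objects)} objects remapped (~{moved / len(objects):.0%})")
    print("With a naive mod-N placement, nearly everything would have moved.")

Running this, adding a fifth OSD remaps on the order of a fifth of the objects, which is the hand-wavy behavior to expect from the real system too: the amount of data moved scales with the capacity you add, not with the size of the whole data set.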