On 05/30/2013 11:06 PM, Sage Weil wrote:
> Hi everyone,

Hi again,

> I wanted to mention just a few things on this thread.

Thank you for taking the time.

> The first is obvious: we are extremely concerned about stability.
> However, Ceph is a big project with a wide range of use cases, and it is
> difficult to cover them all. For that reason, Inktank is (at least for
> the moment) focusing on specific areas (rados, librbd, rgw) and certain
> platforms. We have a number of large production customers and
> non-customers now who have stable environments, and we are committed to a
> solid experience for them.

And I really appreciate that.

> We are investing heavily in testing infrastructure and automation tools to
> maximize our ability to test with limited resources. Our lab is currently
> around 14 racks, with most of the focus now on utilizing those resources
> as effectively as possible. The teuthology testing framework continues to
> evolve and our test suites continue to grow. Unfortunately, this has been
> an area where it has been difficult for others to contribute. We are
> eager to talk to anyone who is interested in helping.

What we as a community can do is provide feedback with our test cases, and I think you're doing a great job of supporting the community.

> Overall, the cuttlefish release has gone much more smoothly than bobtail
> did. That said, there are a few lingering problems, particularly with the
> monitor's use of leveldb. We're waiting on some QA on the pending fixes
> now before we push out a 0.61.3 that I believe will resolve the remaining
> problems for most users.

I upgraded to 0.61.4 on a production system today, and it all went smoothly. I was really nervous that things could blow up. I can't add monitors, though. I have another thread going on, so don't bother. What I want to say is: this needs to work. In my mind, the mon issues must all be fixed.
If I were Inktank, I would freeze all further features and fix all bugs (I know this is boring, but it is business-critical) until ceph gets so stable that there are no more complaints from users. You are so close. Right now, when I promote ceph and people ask me "but is it stable?", I still have to say: it's almost there.

> However, as overall adoption of ceph increases, we move past the critical
> bugs and start seeing a larger number of "long-tail" issues that affect
> smaller sets of users. Overall this is a good thing, even if it means a
> harder job for the engineers to triage and track down obscure problems.

I realize this is very hard, and maybe very boring.

> The mailing list is going to attract a high number of bug reports because
> that's what it is for. Although we believe the quality is getting better
> based on our internal testing and our commercial interactions, we'd like
> to turn this into a more metrics-driven analysis. We welcome any ideas on
> how to do this, as the obvious ideas (like counting bugs) tend to scale
> with the number of users, and we have no way of telling how many users
> there really are.

I really want to see you succeed big time. Ceph is one of the best things I have come across in a long time. I don't want to tell you what to do, because you know it better than I do. All I am saying is: if you make it very robust, people will not stop buying support contracts.

> Thanks-
> sage

Thank you, sage. We all owe you more than a 'thank you'.

wogri
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com