Re: increasing stability

On 05/30/2013 11:06 PM, Sage Weil wrote:
> Hi everyone,

Hi again,

> I wanted to mention just a few things on this thread.

Thank you for taking the time.

> The first is obvious: we are extremely concerned about stability.  
> However, Ceph is a big project with a wide range of use cases, and it is 
> difficult to cover them all.  For that reason, Inktank is (at least for 
> the moment) focusing in specific areas (rados, librbd, rgw) and certain 
> platforms.  We have a number of large production customers and 
> non-customers now who have stable environments, and we are committed to a 
> solid experience for them.

And I really appreciate that.

> We are investing heavily in testing infrastructure and automation tools to 
> maximize our ability to test with limited resources.  Our lab is currently 
> around 14 racks, with most of the focus now on utilizing those resources 
> as effectively as possible.  The teuthology testing framework continues to 
> evolve and our test suites continue to grow.  Unfortunately, this has been 
> an area where it has been difficult for others to contribute.  We are 
> eager to talk to anyone who is interested in helping.

What we as a community can do is provide feedback from our test cases,
and I think you're doing a great job of supporting the community.

> Overall, the cuttlefish release has gone much more smoothly than bobtail 
> did.  That said, there are a few lingering problems, particularly with the 
> monitor's use of leveldb.  We're waiting on some QA on the pending fixes 
> now before we push out a 0.61.3 that I believe will resolve the remaining 
> problems for most users.

I upgraded to 0.61.4 on a production system today, and it all went
smoothly. I was really nervous that things could blow up.
I can't add monitors, though. I have another thread going on about
that, so don't bother with it here. What I want to say is: this needs
to work. In my mind, the mon issues must all be fixed. If I were
Inktank, I would freeze all further features and fix all bugs (I know
this is boring, but it is business-critical) until Ceph gets so stable
that there are no more complaints from users. You are so close.
Right now, when I promote Ceph and people ask me "but is it stable?", I
still have to say: it's almost there.

> However, as overall adoption of ceph increases, we move past the critical 
> bugs and start seeing a larger number of "long-tail" issues that affect 
> smaller sets of users.  Overall this is a good thing, even if it means a 
> harder job for the engineers to triage and track down obscure problems. 

I realize this is very hard, and maybe very boring.

> The mailing list is going to attract a high number of bug reports because 
> that's what it is for.  Although we believe the quality is getting better 
> based on our internal testing and our commercial interactions, we'd like 
> to turn this into a more metrics driven analysis.  We welcome any ideas on 
> how to do this, as the obvious ideas (like counting bugs) tend to scale 
> with the number of users, and we have no way of telling how many users 
> there really are.

I really want to see you succeed big time. Ceph is one of the best
things I have come across in a long time. I don't want to tell you
what to do, because you know it better than I do. All I am saying
is: if you make it very robust, people will not stop buying support
contracts.

> Thanks-
> sage

Thank you, Sage. We all owe you more than a 'thank you'.
wogri
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com