On Tue, 18 Aug 2015 12:50:38 -0500 Mark Nelson wrote:

[snip]

> Probably the big question is what are the pain points? The most common
> answer we get when asking folks what applications they run on top of
> Ceph is "everything!". This is wonderful, but not helpful when trying
> to figure out what performance issues matter most! :)
>
Well, the "everything" answer really is the one everybody who runs VMs
backed by RBD for internal or external customers will give. That is: no
idea what is installed and no control over how it accesses the Ceph
cluster.

And even when you think you have a predictable use case, it might not
stay true. Case in point, one of our Ceph installs backs a Ganeti
cluster with hundreds of VMs running two types of applications, and from
past experience I know their I/O patterns (nearly 100% write-only; any
reads can usually be satisfied from local or storage-node page cache).
The Ceph cluster was therefore configured and optimized for exactly
this, and it worked beautifully until:

a) scrubs became too heavy (generating too many read IOPS while also
   invalidating page caches), and
b) somebody thought a third type of VM, running Windows and generating
   as many IOPS as dozens of the other VMs, would be a good idea.

> IE, should we be focusing on IOPS? Latency? Finding a way to avoid
> journal overhead for large writes? Are there specific use cases where
> we should specifically be focusing attention? general iscsi? S3?
> databases directly on RBD? etc. There's tons of different areas that we
> can work on (general OSD threading improvements, different messenger
> implementations, newstore, client side bottlenecks, etc) but all of
> those things tackle different kinds of problems.
>
All of these except S3 would have a positive impact on my various use
cases.

However, at the risk of sounding like a broken record: any time spent on
these improvements before Ceph can recover from a scrub error fully
autonomously (read: checksums) would be a waste in my book.
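To make the repair semantics concrete, here is a toy model (plain
Python, nothing to do with Ceph's actual internals; the function names
and data are made up) contrasting "copy the primary over the replicas"
with the checksum-driven repair being asked for:

```python
import hashlib

def naive_repair(replicas):
    """Mimic a repair that simply imposes the primary's copy
    (index 0) on every replica, whether or not it is the good one."""
    return [replicas[0]] * len(replicas)

def checksum_repair(replicas, known_checksum):
    """Checksum-driven repair: find a copy whose digest matches the
    stored checksum, then propagate that copy to all replicas."""
    for data in replicas:
        if hashlib.sha256(data).hexdigest() == known_checksum:
            return [data] * len(replicas)
    raise RuntimeError("no replica matches the stored checksum")

good = b"customer data"
bad = b"bit-rotted junk"
checksum = hashlib.sha256(good).hexdigest()

# The primary (index 0) holds the corrupt copy; both replicas are fine.
replicas = [bad, good, good]

# Naive repair wipes the two good copies with the corrupt primary.
assert naive_repair(replicas) == [bad, bad, bad]

# Checksum-aware repair restores the good data everywhere.
assert checksum_repair(replicas, checksum) == [good, good, good]
```

The point of the toy: without a checksum (or some other ground truth),
the repair step has no way to know which copy is correct, so it can only
guess, and guessing "the primary" loses good data whenever the primary
is the corrupt one.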
All the speed in the world is pretty insignificant when a simple "ceph
pg repair" (which is still in the Ceph docs w/o any qualification of
what it actually does) has a good chance of wiping out good data "by
imposing the primary OSD's view of the world on the replicas", to quote
Greg.

Regards,

Christian
--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com