On Tue, 18 Aug 2015 12:50:38 -0500 Mark Nelson wrote:

[snip]

> Probably the big question is what are the pain points? The most common
> answer we get when asking folks what applications they run on top of
> Ceph is "everything!". This is wonderful, but not helpful when trying
> to figure out what performance issues matter most! :)
>
Well, the "everything" answer really is the one everybody who runs VMs
backed by RBD for internal or external customers will give. That is: no
idea what is installed and no control over how it accesses the Ceph
cluster.

And even when you think you have a predictable use case, it might not
stay true. Case in point, one of our Ceph installs backs a Ganeti
cluster with hundreds of VMs running two types of applications, and from
past experience I know their I/O patterns (nearly 100% write-only; any
reads can usually be satisfied from local or storage-node page cache).
The Ceph cluster was therefore configured and optimized for exactly
this, and it worked beautifully until:

a) scrubs became too heavy (generating too many read IOPS while also
   invalidating page caches), and
b) somebody thought a third type of VM, running Windows and generating
   as many IOPS as dozens of the other VMs, would be a good idea.

> IE, should we be focusing on IOPS? Latency? Finding a way to avoid
> journal overhead for large writes? Are there specific use cases where
> we should specifically be focusing attention? general iscsi? S3?
> databases directly on RBD? etc. There's tons of different areas that we
> can work on (general OSD threading improvements, different messenger
> implementations, newstore, client side bottlenecks, etc) but all of
> those things tackle different kinds of problems.
>
All of these except S3 would have a positive impact on my various use
cases.

However, at the risk of sounding like a broken record: any time spent on
these improvements before Ceph can recover from a scrub error fully
autonomously (read: checksums) would be a waste in my book.
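To make the repair semantics concrete, here is a toy model (plain
Python, nothing to do with Ceph's actual internals; the function names
and data are made up) contrasting "copy the primary over the replicas"
with the checksum-driven repair being asked for:

```python
import hashlib

def naive_repair(replicas):
    """Mimic a repair that simply imposes the primary's copy
    (index 0) on every replica, whether or not it is the good one."""
    return [replicas[0]] * len(replicas)

def checksum_repair(replicas, known_checksum):
    """Checksum-driven repair: find a copy whose digest matches the
    stored checksum, then propagate that copy to all replicas."""
    for data in replicas:
        if hashlib.sha256(data).hexdigest() == known_checksum:
            return [data] * len(replicas)
    raise RuntimeError("no replica matches the stored checksum")

good = b"customer data"
bad = b"bit-rotted junk"
checksum = hashlib.sha256(good).hexdigest()

# The primary (index 0) holds the corrupt copy; both replicas are fine.
replicas = [bad, good, good]

# Naive repair wipes the two good copies with the corrupt primary.
assert naive_repair(replicas) == [bad, bad, bad]

# Checksum-aware repair restores the good data everywhere.
assert checksum_repair(replicas, checksum) == [good, good, good]
```

The point of the toy: without a checksum (or some other ground truth),
the repair step has no way to know which copy is correct, so it can only
guess, and guessing "the primary" loses good data whenever the primary
is the corrupt one.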
All the speed in the world is pretty insignificant when a simple "ceph
pg repair" (which is still in the Ceph docs w/o any qualification of
what it actually does) has a good chance of wiping out good data "by
imposing the primary OSD's view of the world on the replicas", to quote
Greg.

Regards,

Christian
--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com