I'm not sure. It looks like Ceph and your disk controllers are doing basically the right thing, since you're going from 1 GB/s to 420 MB/s when moving from dd to Ceph (the full data journaling cuts it in half). Just FYI, though: that dd run is not doing nearly the same thing Ceph does. You'd need to use direct I/O or similar; the conv=fsync flag only means dd fsyncs the written data once at the end of the run, not at any intermediate point (a rough sketch of both kinds of invocation is below).

The drop from 1 node to 2 cutting your performance that much is a bit odd. I do note that

    1 node:  420 MB/s each
    2 nodes: 320 MB/s each
    5 nodes: 275 MB/s each

so you appear to be hitting some kind of bound. Your note that dd can do 2 GB/s without networking makes me think you should explore that. As you say, network interrupts can be problematic on some systems. The only thing I can think of that has been really bad in the past is that some systems process all network interrupts on CPU 0, and you probably want to make sure they're being spread across CPUs (example commands for checking that are below).

-Greg
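For comparison, here is a sketch of the two kinds of dd run being discussed; the output path, block size, and count are placeholders, so adjust them for your setup:

    # Buffered writes with a single fsync at the very end (what the original
    # test measured): mostly shows how fast the page cache can absorb data.
    dd if=/dev/zero of=/mnt/osd0/ddtest bs=4M count=2500 conv=fsync

    # Closer to what the OSD journal does: O_DIRECT writes that bypass the
    # page cache and hit the disk on every write.
    dd if=/dev/zero of=/mnt/osd0/ddtest bs=4M count=2500 oflag=direct

The buffered run will usually report a much higher number than the direct run against the same disk, which is why it isn't a fair baseline for Ceph throughput.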
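On the interrupt point, one quick way to see whether all the NIC interrupts are landing on CPU 0 is to look at /proc/interrupts; the interface name and IRQ number below are just examples and will differ on your hardware:

    # Per-CPU interrupt counts for the NIC queues; if only the CPU0 column
    # grows during a test, the interrupts aren't being spread out.
    grep eth0 /proc/interrupts

    # Pin a given IRQ to a specific CPU by writing a hex CPU mask to its
    # affinity file (here mask 4 = CPU 2); requires root.
    echo 4 > /proc/irq/57/smp_affinity

Running irqbalance, or spreading the queue IRQs across cores by hand like this, is the usual way to keep one core from becoming the bottleneck.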