Performance questions (how original, I know)

Hello,

new to Ceph, not new to replicated storage.
Simple test cluster with two identical nodes running Debian Jessie, thus Ceph
0.48. And yes, I very much prefer a distro-supported package.
Single mon and osd1 on node a, osd2 on node b.
1GbE direct interlink between the nodes, used exclusively for this setup.
Bog-standard, minimal configuration, with a journal declared, but it lives on
the same backing storage.
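For reference, the ceph.conf is essentially the following sketch; the
hostnames, mon address and journal settings here are stand-ins rather than my
literal values:

[global]
        auth supported = none

[mon.a]
        host = node-a
        mon addr = 192.168.0.1:6789

[osd]
        osd journal = /var/lib/ceph/osd/$name/journal
        osd journal size = 1024

[osd.0]
        host = node-a

[osd.1]
        host = node-b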
The backing storage can do this locally (bonnie++):
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
irt03            8G           89267  21 60474  15           267049  37 536.9  12
Latency                        4792ms     245ms             44908us     113ms
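(In case anyone wants to reproduce this: the runs were plain bonnie++,
something like

        bonnie++ -d /mnt/test -s 8G -f -n 0 -u root

where -f skips the per-character phases, which is presumably why those columns
are empty, and -n 0 skips the file-creation tests; the mount point is a
placeholder.)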

And this with a 20GB rbd (formatted the same way, ext4, as the test
above) mounted on the node that hosts the mon and osd1:
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
irt03            8G           11525   2  5562   1           48221   6 167.3   3
Latency                        5073ms    2912ms               321ms    2841ms
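For completeness, the rbd volume was set up in the obvious way, along these
lines (image name and mount point are placeholders, commands from memory):

        rbd create test --size 20480    # size is in MB, so a 20GB image
        rbd map test                    # kernel client; shows up as /dev/rbd0 here
        mkfs.ext4 /dev/rbd0
        mount /dev/rbd0 /mnt/rbd-test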

I'm looking at Ceph/RBD to store VM volumes with Ganeti, and these numbers
frankly scare me.
Watching the traffic with ethstats, I never saw anything higher than this
during writes (on node a):
  eth2:   72.32 Mb/s In   127.99 Mb/s Out -   8035.4 p/s In   11649.5 p/s Out

I assume the traffic coming back in is replication traffic from node b, right?
What prevented it from using more than about 13% of the network link capacity?
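A back-of-the-envelope check, based on my (possibly wrong) reading of the
write path: the ~11.5 MB/s block-write rate above is roughly 92 Mb/s of
payload. With the client on node a, every write should cross the link once on
its way to node b (either straight to a PG whose primary is osd2, or as a
replica from osd1), which is in the ballpark of the 128 Mb/s out once you add
protocol overhead. Replicas for PGs whose primary is osd2 would travel back to
osd1, which could account for much of the 72 Mb/s in.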

Aside from that cringeworthy drop to 15% of the backing storage speed (and of
the network link capacity), which I presume might be salvageable by using an
SSD journal, I'm more than a little puzzled by the read speed.
For starters, I would have assumed that in this 2-replica setup all the data
is present on the local node a and that Ceph would be smart enough to read it
all locally. But even if it were talking to both nodes a and b (or just b), I
would have expected something in the 100 MB/s range.
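If it helps narrow things down, I can take RBD and the filesystem out of the
picture with rados bench, along these lines (default rbd pool assumed; on
newer releases the write phase needs --no-cleanup to keep its objects around
for the read pass):

        rados bench -p rbd 60 write -t 16   # 16 concurrent 4MB object writes
        rados bench -p rbd 60 seq -t 16     # sequential reads of those objects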

Any insights would be much appreciated.

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Fusion Communications
http://www.gol.com/



