Re: Performance questions (how original, I know)


On 12/16/2013 02:42 AM, Christian Balzer wrote:

> Hello,

Hi Christian!


> new to Ceph, not new to replicated storage.
> Simple test cluster with 2 identical nodes running Debian Jessie, thus ceph
> 0.48. And yes, I very much prefer a distro-supported package.

I know you'd like to use the distro package, but 0.48 is positively ancient at this point. There have been a *lot* of fixes and changes since then. If it makes you feel better, our current professionally supported release is based on dumpling.
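If you ever do decide to try upstream packages instead, the ceph.com Debian repos are the usual route. Very roughly, something like the following (I'm assuming the debian-dumpling repo layout and the release key URL from the current docs, and that the wheezy builds behave on jessie, so double-check all of that first):

  # import the Ceph release key (verify the URL against the docs)
  wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | sudo apt-key add -
  # add the dumpling repo and install
  echo deb http://ceph.com/debian-dumpling/ wheezy main | sudo tee /etc/apt/sources.list.d/ceph.list
  sudo apt-get update && sudo apt-get install ceph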

> Single mon and osd1 on node a, osd2 on node b.
> 1GbE direct interlink between the nodes, used exclusively for this setup.
> Bog standard, minimum configuration, declaring a journal but that's on the
> same backing storage.
> The backing storage can do this locally (bonnie++):
> Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
> Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> irt03            8G           89267  21 60474  15           267049  37 536.9  12
> Latency                        4792ms     245ms             44908us     113ms

> And this with a 20GB rbd (formatted the same way, ext4, as the test
> above) mounted on the node that hosts the mon and osd1:
> Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
> Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> irt03            8G           11525   2  5562   1           48221   6 167.3   3
> Latency                        5073ms    2912ms               321ms    2841ms

> I'm looking at Ceph/RBD to store VM volumes with ganeti and these numbers
> frankly scare me.
> Watching the traffic with ethstats, I never saw anything higher than this
> during writes (on node a):
>    eth2:   72.32 Mb/s In   127.99 Mb/s Out -   8035.4 p/s In   11649.5 p/s Out

> I assume the traffic coming back in is replica stuff from node b, right?
> What prevented it from using more than about 13% of the network link capacity?

> Aside from that cringeworthy drop to 15% of the backing storage speed (and
> network link), which I presume might be salvageable by using an SSD journal,
> I'm more than puzzled by the read speed.
> For starters, I would have assumed that in this 2-replica setup all data is
> present on the local node a and Ceph would be smart enough to get it all
> locally. But even if it was talking to both nodes a and b (or just b), I
> would have expected something in the 100MB/s range.

Ceph reads each object from its primary OSD, so wherever the primary happens to be located, that's where the read will be served from. The good news is that this gives you a better chance of spreading your reads out over the whole cluster. The bad news is that you have more network traffic to deal with.
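If you're curious where the primary for any particular object lives, you can ask the cluster directly (assuming 0.48 already has this command; the pool and object names below are just placeholders):

  # map an object name to its placement group and OSDs; the first OSD
  # listed in the acting set is the primary that serves the reads
  ceph osd map rbd some-object-name

The acting set it prints is ordered, so with two OSDs you should see roughly half the objects with osd.0 as primary and half with osd.1, which is why roughly half of your reads end up crossing the link.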


> Any insights would be much appreciated.

With 0.48 it's tough to make recommendations because frankly I don't remember exactly what has changed since then. You'll probably want to make sure that syncfs is being used, and you'll probably want to play around with enabling/disabling the filestore flusher and maybe turning journal aio on. It looks like RBD cache was included in 0.46, so you can try enabling that, but it had performance issues with sequential writes before cuttlefish.
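For reference, the knobs I have in mind live in ceph.conf and would look something like this (treat the values as things to experiment with rather than recommendations I can stand behind on 0.48; the SSD journal lines are only an illustration of where that would go):

  [client]
          # client-side RBD cache (in since 0.46, but sequential write
          # performance with it was poor before cuttlefish)
          rbd cache = true

  [osd]
          # benchmark with the flusher on and off and compare
          filestore flusher = false
          # async IO for the journal
          journal aio = true
          # if you add SSDs later, point the journals at them, e.g.:
          ;osd journal = /srv/ssd/$name/journal
          ;osd journal size = 1024

As far as I know there's no switch for syncfs itself; the filestore uses it automatically when glibc and the kernel are new enough, which on jessie they should be.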

At least you'll be on a relatively modern kernel!


> Regards,
>
> Christian


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



