On 07/11/2013 11:16 AM, Erwan Velu wrote:
On 11/07/2013 16:56, Mark Nelson wrote:
And we've now got part 3 out showing 128K FIO results:
http://ceph.com/performance-2/ceph-cuttlefish-vs-bobtail-part-3-128k-rbd-performance/
Hey Mark,
Hi!
Since you mention 10GbE at the end of your document, I have the
following questions for you.
What kind of network switch are you using? It's not listed in the
hardware setup.
I cheated and directly connected the NICs in each node with 3' SFP+
cables. The bonding is just Linux round-robin. This is probably about
as good as it gets from a throughput and latency perspective!
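
Roughly speaking, a balance-rr bond like that can be brought up with
something like the following (this is just a sketch, not my exact
config, and the interface names are placeholders):

    # load the bonding driver in round-robin mode with link monitoring
    modprobe bonding mode=balance-rr miimon=100
    # bring up the bond and enslave the two directly connected 10GbE ports
    ip link set bond0 up
    ifenslave bond0 eth2 eth3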
Did you configure anything notable on it?
Not too much beyond the craziness of getting a bridge working on top of
bonded 10GbE interfaces. I did tweak TCP reordering to help out:
net.ipv4.tcp_reordering=127
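
For reference, that can be applied on the fly and made persistent like so:

    # apply immediately
    sysctl -w net.ipv4.tcp_reordering=127
    # persist across reboots
    echo "net.ipv4.tcp_reordering=127" >> /etc/sysctl.conf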
Did you estimate the network bandwidth between your hosts to see if you
reach 10GbE?
I ran iperf on the bonded link and was sitting right around 2GB/s in
both directions with multiple streams. I also did some iperf tests from
individual VMs and was able to get similar (maybe slightly less)
throughput. Now that I think about it, I'm not sure I did a test with
concurrent iperfs from all of the VMs at once, which would have been a
good test to do.
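
The tests themselves were just the standard iperf client/server with
multiple parallel streams, something along these lines (the hostname is
a placeholder):

    # on one node
    iperf -s
    # on the other node: 4 parallel TCP streams for 30 seconds
    iperf -c <other-node> -P 4 -t 30
    # then swap the roles to test the other direction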
On my setup (I'm close to releasing a set of tools for benchmarking &
graphing these), I needed to use jumbo frames with an MTU of 7500. Is
that your case too? If so, it would be lovely to understand your tuning.
At MTU=1500 I only had 6Gbps, while 7500 gave me more than 9000 Mbps.
I'm actually using MTU=1500. I suspect I can get away with it because
the cards are directly connected. I fiddled with increasing it up to
9000, but ran into some strange issues with the bonding/bridge and had
worse performance and stability, so I set it back to 1500. The
bonding/bridge setup was pretty finicky to get working.
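
For what it's worth, the general shape of the bridge-on-bond setup is
below (brctl from bridge-utils; names are placeholders, and this is a
sketch rather than my exact config). If you do experiment with jumbo
frames, the MTU has to match on the slaves, the bond, and the bridge:

    # bridge for the VMs on top of the bond
    brctl addbr br0
    brctl addif br0 bond0
    ip link set br0 up
    # only if raising the MTU: set it consistently all the way down
    ip link set dev eth2 mtu 9000
    ip link set dev eth3 mtu 9000
    ip link set dev bond0 mtu 9000
    ip link set dev br0 mtu 9000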
That could be very valuable to others who do benchmarking or those who
want to optimize their setup.
I think the best advice here is to know your network, what your hardware
is capable of doing, and read the documentation in the kernel src. The
impression I've gotten over the years is that network tuning is as much
of an art as disk IO tuning. You really need to know what your software
is doing, what's happening at the hardware/driver level, and what's
happening at the switches. On big deployments, just dealing with
bisection bandwidth issues on supposed fat-tree topology switches can be
a project by itself!
Thanks for your great work,
Erwan
Thank you! I really like to hear that people are enjoying the articles.
Mark