2016-06-16 3:53 GMT+02:00 Christian Balzer <chibi@xxxxxxx>:
> Gandalf, first read:
> https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg29546.html
>
> And this thread by Nick:
> https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg29708.html

Interesting reading, thanks.

> Overly optimistic.
> In an idle cluster with synthetic tests you might get sequential reads
> that are around 150MB/s per HDD.
> As for writes, think 80MB/s, again in an idle cluster.
>
> Any realistic, random I/O and you're looking at 50MB/s at most either way.
>
> So your storage nodes can't really saturate even a single 10Gb/s link in
> real life situations.

Ok.

> Journal SSDs can improve on things, but that's mostly for IOPS.
> In fact they easily become the bottleneck bandwidth-wise and are so on
> most of my storage nodes.
> Because you'd need at least 2 400GB DC S3710 SSDs to get around 1GB/s
> writes, or one link worth.

I plan to use 1 or 2 journal SSDs (probably 1 SSD for every 6 spinning
disks); rough numbers at the end of this mail.

> Splitting things in cluster and public networks ONLY makes sense when your
> storage node can saturate ALL the network bandwidth, which usually is only
> the case when it comes to very expensive SSD/NVMe only nodes.

This is not my case.

> Going back to your original post, with a split network the latency in both
> networks counts the same, as a client write will NOT be acknowledged until
> it has reached the journal of all replicas, so having a higher latency
> cluster network is counterproductive.

Ok.

> Or if you can start with a clean slate (including the clients), look at
> Infiniband.
> All my production clusters are running entirely IB (IPoIB currently) and
> I'm very happy with the performance, latency and cost.

Yes, I'll start with a brand new network. Actually I'm testing with some
old IB switches (DDR) and I'm not very happy: IPoIB doesn't go over
8-9 Gbit/s on DDR. Additionally, the CX4 cables used by DDR are... HUGE
and very hard to bend in the rack. I don't know if QDR cables are thinner.
Are you using QDR?

I've seen a couple of used Mellanox switches on eBay that seem to fit my
needs. 36 QDR ports would be awesome, but I don't have any IB knowledge.
Could I keep the IB fabric unconfigured and use only IPoIB?

I can create a bonded (failover) IPoIB device on each node and add 2 or
more IB cables between the two switches (a rough sketch of what I have in
mind is at the end of this mail). In a normal Ethernet network those 2
cables would have to be joined in a LAG to avoid loops. Is InfiniBand able
to manage this on its own? I've never found a way to aggregate multiple
ports.

The real drawback with IB is that I'd have to add IB cards to every
compute node, whereas my current compute nodes have two 10GBase-T ports
onboard. This adds some cost...
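
Coming back to the journal sizing, just to sanity-check my ratio (taking
the ~80MB/s per-HDD write figure above and the ~470MB/s sequential write
spec of the 400GB DC S3710, so all numbers are rough):

    6 HDDs x ~80MB/s  = ~480MB/s  -> roughly one 400GB DC S3710 per 6 OSDs
    2 SSDs x ~470MB/s = ~940MB/s  -> about one 10Gb/s link worth of writes

So on paper the 1-SSD-per-6-disks ratio should keep the journals from
being the bandwidth bottleneck.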
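
For reference, the public/cluster split I was asking about in my original
post would just be two lines in ceph.conf (the subnets below are only
examples):

    [global]
        public network  = 192.168.10.0/24    # client and MON traffic
        cluster network = 192.168.20.0/24    # replication and recovery traffic

Given the bandwidth numbers above I'll probably skip the split and run
everything over a single network.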
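
About keeping the fabric "unconfigured": from what I've read, even plain
IPoIB needs a subnet manager running somewhere, either on a managed switch
or as opensm on one of the hosts, so I'd expect something like this
(Debian/Ubuntu style, untested on my side):

    apt-get install opensm           # or the equivalent package on your distro
    systemctl enable --now opensm    # a second instance on another host gives redundancy
    ibstat                           # ports should go from "Initializing" to "Active"

Is that all there is to it, or is more fabric configuration needed in
practice?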
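
The bonded (failover) IPoIB device I have in mind would look roughly like
this on Debian with ifenslave (interface names and addresses are just
placeholders, untested):

    # /etc/network/interfaces
    auto bond0
    iface bond0 inet static
        address 192.168.20.11
        netmask 255.255.255.0
        bond-slaves ib0 ib1
        bond-mode active-backup    # the only bonding mode that works over IPoIB
        bond-miimon 100
        bond-primary ib0

As far as I understand, the subnet manager computes the routes inside the
fabric, so the two inter-switch cables shouldn't need any LAG/STP-style
configuration, but please correct me if that's wrong.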