Hello,

On Thu, 29 Sep 2016 07:37:45 -0700 Gerald Spencer wrote:

> Greetings new world of Ceph,
>
> Long story short, at work we perform high throughput volumetric imaging and
> create a decent chunk of data per machine. We are about to bring the next
> generation of our system online and the IO requirements will outpace our
> current storage solution (jbod using zfs on Linux). We are currently
> searching for a template-able scale out solution that we can add as we
> bring each new system online starting in a few months. There are several
> quotes floating around from all of the big players, but the buy in on
> hardware and software is unsettling as they are a hefty chunk of change.
>
> The current performance we are currently estimating is per machine:

What constitutes a "machine" in your setup, a client that is interacting
with the data on the Ceph cluster?
How many clients initially, so we can see how a Ceph based solution
compares to the offers from the big boys?

> - simultaneous 30Gbps read and 30Gbps write

So at least 40Gb/s interfaces on those.
Is that a continuous number/requirement? Or do you have working sets that
are being dealt with, saved and then off to the next one? If so, how large
are these sets?

> - 180 TB capacity (roughly a two day buffer into a public cloud)
>

Capacity is easy with Ceph. Sequential writes and reads (the latter with
tuning) are pretty straightforward.
Small, unbuffered writes, especially from a single client, are the most
challenging, but that's obviously not what you're doing.

> So our question is: are these types of performances possible using Ceph? I
> haven't found any benchmarks of this nature beyond

I think some people with largish clusters have posted some here.

> https://www.mellanox.com/related-docs/whitepapers/WP_Deploying_Ceph_over_High_Performance_Networks.pdf

That's a bit dated, but a good start.
Another thing to keep in mind, if you're starting with a clean slate, is to
consider Infiniband, both for even higher speeds and lower latencies, and
for lower prices than brand-name Ethernet gear.
Also Bluestore would help (like everywhere else), but that's one year out
and thus not an option.

> Which claims 150GB/s? I think perhaps they meant 150Gb/s (150 1Gbps
> clients).
>

Where are you seeing that number? The chart is the only thing that seems
to be labeled wrongly.

Anyway, as others have mentioned already, I don't see a problem with
achieving that, but it won't be as cheap/small as a generic 180TB cluster.

Simple basic design for one client "machine", scale out as needed:

- 40Gb/s+ networking everywhere. Lots of options to go faster (IB) or
  cheaper (white box switches) here.

- enough storage bandwidth for 3GB/s write and read. That's 50MB/s per HDD
  (a bit lowballed, but not by much) with SSD journals. Bluestore would be
  about twice as fast here, at least.

  3000MB/s / 50MB/s = 60 HDDs
  60 *2 (read AND write) *3 (replication) = 360 HDDs + 72 journal SSDs.

  Any HDD at 2TB or larger will satisfy your space requirements.

Starting to sound spendy yet?

Alternatively, look at a pure SSD/NVMe cluster, for example based on Intel
DC S3610 SSDs (1.6TB).

3000MB/s / 250MB/s (read AND write) = 12 SSDs (good node size)
12 *3 (replication, 2 if you feel brave) = 36 SSDs (bandwidth solved).

Alas for your space needs, 28 of these 12 SSD nodes are required, 336 SSDs
total.
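
If it helps, here is the same back-of-the-envelope math as a quick Python
sketch. The per-device throughput figures, the 1:5 journal ratio and the
12-SSD node size are my own ballpark assumptions from above, adjust them
to whatever your hardware actually delivers:

# Quick sizing sketch; mirrors the back-of-the-envelope math above.
# Assumptions: 50MB/s per HDD (filestore with SSD journals), 250MB/s per
# DC S3610 class SSD, 3x replication, 1 journal SSD per 5 HDDs, 1.6TB SSDs.

target_mb_s = 3000      # 30Gb/s read AND 30Gb/s write, per direction
capacity_tb = 180
replication = 3

# HDD + journal SSD option
hdds = target_mb_s / 50 * 2 * replication     # read AND write, then replicas
journals = hdds / 5                           # 1:5 journal ratio
print(f"HDDs: {hdds:.0f}, journal SSDs: {journals:.0f}")          # 360, 72

# All-SSD option
ssds_bw = target_mb_s / 250 * replication     # bandwidth driven -> 36
ssds_cap = capacity_tb * replication / 1.6    # capacity driven  -> ~338
nodes = round(ssds_cap / 12)                  # ~28 twelve-SSD nodes
print(f"SSDs (bandwidth): {ssds_bw:.0f}, SSDs (capacity): {ssds_cap:.0f}, "
      f"nodes: {nodes}")

As you can see, capacity is what drives the all-SSD option, not bandwidth.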

Regards,

Christian
--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com