Re: SSD Hardware recommendation

Hi,

On 18 Mar 2015, at 05:29, Christian Balzer <chibi@xxxxxxx> wrote:


Hello,

On Wed, 18 Mar 2015 03:52:22 +0100 Josef Johansson wrote:

Hi,

I’m planning a Ceph SSD cluster. I know we won’t get the full
performance out of the SSDs in this case, but SATA won’t cut it as backend
storage, and SAS is the same price as SSD now.

Have you actually tested SATA with SSD journals?
Given a big enough (number of OSDs) cluster you should be able to come
close to SSD performance currently achievable with regards to a single
client.

Yeah,
The problem is really the latency when the backing storage is fully utilised, especially while rebalancing data and deep scrubbing.
The MySQL instances are currently living on SSD journal + SATA backing storage, so this is the problem I’m trying to solve.
The backend network will be 10GbE active/passive, but will be used
mainly for MySQL, so we’re aiming to swallow the IO.

Is this a single MySQL instance or are we talking various VMs here?
If you're flexible in regards to the network, Infiniband will give you
lower latency, especially with the RDMA stuff being developed currently
for Ceph (I'd guess a year or so out).
Because with a single client (or a few), IOPS per client/thread are limited
by the latency of the network (and of course the whole Ceph stack) more
than anything else, so on a single thread you're never going to see
performance anywhere near what a local SSD could deliver.
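
As a rough illustration of that ceiling (the latency figures below are made-up round numbers for illustration, not measurements):

# A single synchronous client thread has only one write in flight at a time,
# so its IOPS ceiling is roughly 1 / round-trip write latency.
def single_thread_iops(latency_ms):
    return 1000.0 / latency_ms

# Hypothetical latencies, purely for illustration:
for label, lat_ms in [("local SSD, ~0.05 ms", 0.05),
                      ("Ceph over 10GbE, ~1 ms", 1.0),
                      ("Ceph over IB/RDMA, ~0.5 ms", 0.5)]:
    print("%-22s -> ~%6.0f IOPS per thread" % (label, single_thread_iops(lat_ms)))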

Going to use 150-200 MySQL clients, one on each VM, so the load should be good for Ceph =)
And sadly I’m in no position to use RDMA etc., as it’s already been decided that we’ll go with 10GBase-T.
Really liking the SM servers with 4x 10GBase-T =)
Thanks for the recommendation though.

So, for 10x SSD drives, what kind of CPU would that need? Just go all
out with two 10-core 3.5GHz CPUs? I read somewhere that you should use the
fastest CPUs you can afford.

Indeed.
With 10 SSDs, even that will probably be CPU bound for small IOPS with
current stable Ceph versions.
See my list archive URL below.

What size SSDs?
Is the number of SSDs a result of needing the space, or is it there to get
more OSDs and thus performance?
Both performance and space, so 1TB drives (well, 960GB in this case).
100GB of MySQL per VM for 100 VMs (calculated on a replication of 3).
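
Roughly, as a sizing sketch (the ~70% fill target is just my assumption for rebalance headroom):

# Rough capacity sizing; drive size and data set from above, fill target assumed.
vms = 100
gb_per_vm = 100        # GB of MySQL data per VM
drive_gb = 960         # usable capacity per SSD
target_fill = 0.7      # assumption: keep OSDs around 70% full for headroom

for replication in (3, 2):
    raw_gb = vms * gb_per_vm * replication
    drives = raw_gb / (drive_gb * target_fill)
    print("replication %d: %5d GB raw, ~%4.1f drives at %d%% fill"
          % (replication, raw_gb, drives, target_fill * 100))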

Planning on using the Samsung 845 DC EVO; is anyone using these in current
Ceph clusters?

I'm using them in a DRBD cluster where they were a good fit as their write
endurance was a match for the use case, I needed lots of space (960GB ones)
and the relatively low price was within my budget.

While I'm not tearing out my hair and cursing the day I ever considered
using them, their speed, endurance and some spurious errors I'm not seeing
with Intel DC S3700s in the same server have me considering DC S3610s
instead for the next cluster of this type I'm currently planning.

Compare those Intel DC S3610 and DC S3700 with the Samsung 845 DC Pro if you're
building a Ceph cluster; the write amplification I'm seeing with SSD-backed
Ceph clusters will turn your EVOs into scrap metal in no time.

Consider what you think the IO load (writes) generated by your client(s)
will be, multiply that by your replication factor, and divide by the number of
OSDs; that gives you the base load per OSD.
Then multiply by 2 (journal on the same OSD).
Finally, based on my experience and measurements (link below), multiply that
by at least 6, probably 10 to be on the safe side. Use that number to find an
SSD that can handle this write load for the time period you're budgeting
that cluster for.
http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2014-October/043949.html
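
As a minimal sketch of that arithmetic (every input below is a hypothetical placeholder; the 6-10x factor is the one measured in the link above):

# Hedged sketch of the per-SSD endurance estimate described above.
client_write_mb_s = 20     # assumed aggregate client write rate, MB/s (placeholder)
replication = 2
num_osds = 30              # e.g. 3 servers x 10 SSDs (placeholder)
journal_factor = 2         # journal on the same SSD doubles the writes
write_amplification = 10   # 6-10x observed; use 10 to be on the safe side
years = 5                  # budgeted lifetime of the cluster

per_osd_mb_s = (client_write_mb_s * replication / num_osds
                * journal_factor * write_amplification)
tb_written = per_osd_mb_s * 3600 * 24 * 365 * years / 1e6
print("per-SSD write load: %.1f MB/s" % per_osd_mb_s)
print("writes over %d years: ~%.0f TB per SSD -- compare to the drive's rated TBW"
      % (years, tb_written))
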
It feels like I can’t go with anything less than the S3610, especially with a replication factor of 2.
I haven’t done much reading about the S3610, so I will go into depth on them.

We thought of doing a cluster with 3 servers, and any recommendation of
Supermicro servers would be appreciated.

Why 3, replication of 3?
With Intel SSDs and diligent (SMART/NAGIOS) wear level monitoring I'd
personally feel safe with a replication factor of 2.

I’ve seen recommendations of replication 2! The Intel SSDs are indeed durable.
This is only with Intel SSDs, I assume?
I used one of these chassis for the DRBD cluster mentioned above, the
version with Infiniband actually:
http://www.supermicro.com.tw/products/system/2U/2028/SYS-2028TP-DC0TR.cfm

It's compact, and the LSI controller can be flashed into IT mode (or you can
demand IT mode from your SM vendor), so all the SSD drives are directly accessible
and thus capable of being (fs)TRIM'ed. Not that this matters much with Intel DCs.

SM also has 1U servers that fit this drive density bill, but compared to
the 2U servers their 1U rails are very dingy (comes with the size I
guess). ^o^
Yeah, IT mode is the way to go. I tried using RAID 0 to utilise the RAID cache, but then you have problems with not being able to plug in a new drive easily, etc.

This 1U http://www.supermicro.com.tw/products/system/1U/1028/SYS-1028U-TR4T_.cfm is really nice, though it’s missing the SuperDOM peripherals.. so you really only get 8 drives if you need two for the OS.
And the rails.. don’t get me started, but lately they do just snap into the racks! No screws needed. That’s a refreshing change from earlier 1U SM rails.

Thanks!

Josef

Regards,

Christian

Cheers,
Josef

--
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx    Global OnLine Japan/Fusion Communications
http://www.gol.com/

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
