Re: Cost- and Power-efficient OSD Nodes

FWIW, I tried using some 256GB MX100s with Ceph and had horrible performance issues within a month or two.  I was seeing 100% utilization with high latency but only 20 MB/s writes.  I had a number of S3500s in the same pool that were dramatically better; they were actually faster than the hard-disk pool they were fronting, rather than slower.

If you do go with MX200s, I'd recommend using at most 80% of the drive; most cheap SSDs perform *much* better at sustained writes if you give them more over-provisioning space to work with.
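
As a rough illustration of the arithmetic (just a sketch; the 250GB drive size below is an arbitrary example, not a recommendation for a specific model):

# Rough sketch: how much of a drive to actually partition if you want to
# leave ~20% unallocated as extra over-provisioning space. The 250GB
# figure is just an arbitrary example drive size.

def usable_gb(drive_gb: float, op_fraction: float = 0.20) -> float:
    """Capacity to partition, leaving op_fraction of the drive untouched."""
    return drive_gb * (1.0 - op_fraction)

drive_gb = 250.0
print(f"Partition only ~{usable_gb(drive_gb):.0f} GB of {drive_gb:.0f} GB "
      f"and leave the rest unallocated for the controller.")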


Scott


On Tue, Apr 28, 2015, 4:30 PM Dominik Hannen <hannen@xxxxxxxxx> wrote:
> It's all about the total latency per operation. Most IO sizes over 10Gb
> don't make much difference to the round-trip time. But comparatively, even
> 128KB IOs over 1Gb take quite a while. For example, ping a host with a
> payload of 64k over 1Gb and 10Gb networks and look at the difference in
> times. Now double this for Ceph (Client -> Primary OSD -> Secondary OSD).
>
> When you are using SSD journals you normally end up with a write latency of
> 3-4ms over 10Gb; 1Gb networking will probably increase this by another
> 2-4ms. IOPS = 1000 / latency (in ms).
>
> I guess it all really depends on how important performance is

I reckon we are talking about single-threaded IOPS? It looks like 10ms latency
is in the worst-case region, so 100 IOPS will do fine.

At least in my understanding, a heavily multi-threaded load should be able to
achieve higher IOPS regardless of latency?
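
To put rough numbers on that, here is a back-of-the-envelope sketch (only a
sketch: it uses the latency figures quoted above, assumes operations at higher
queue depths overlap perfectly, and the queue depths themselves are arbitrary
examples):

# Back-of-the-envelope: single-threaded IOPS from per-operation latency,
# and how concurrency (queue depth) scales it, assuming operations overlap
# perfectly -- which real workloads won't quite manage.

def iops(latency_ms: float, queue_depth: int = 1) -> float:
    return queue_depth * 1000.0 / latency_ms

for lat in (4.0, 8.0, 10.0):      # ms per write, roughly the figures above
    for qd in (1, 8, 32):         # arbitrary example queue depths
        print(f"latency {lat:4.1f} ms, QD {qd:2d}: ~{iops(lat, qd):7.0f} IOPS")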

Some presentation material suggested that the adverse effects of the higher
latency from 1Gbit only begin above IO sizes of 2k; maybe there is room to tune
IOPS-hungry applications/VMs accordingly.
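
As a rough illustration of why small IOs barely notice the slower link, here
is a simple wire-time estimate (a sketch only: it counts just the
serialization time of the payload, doubled for the Client -> Primary OSD ->
Secondary OSD hop mentioned above, and ignores protocol overhead, switch
latency and parallel replication; the IO sizes are arbitrary examples):

# Rough serialization ("wire") time of a single IO over 1Gb/s vs 10Gb/s,
# doubled for the client -> primary OSD -> secondary OSD hop. Treat the
# results as order-of-magnitude estimates only.

def wire_time_ms(io_bytes: int, link_gbit: float, hops: int = 2) -> float:
    return hops * io_bytes * 8 / (link_gbit * 1e9) * 1000

for size_kib in (2, 64, 128):                  # example IO sizes in KiB
    io_bytes = size_kib * 1024
    t1 = wire_time_ms(io_bytes, 1)
    t10 = wire_time_ms(io_bytes, 10)
    print(f"{size_kib:4d} KiB: ~{t1:.3f} ms on 1Gb vs ~{t10:.3f} ms on 10Gb")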

> Just had a look and the Seagate Surveillance disks spin at 7200 RPM (missed
> that you put that there), whereas the WD ones that I am familiar with spin
> at 5400 RPM, so not as bad as I thought.
>
> So probably OK to use, but I don't see many people using them for Ceph /
> generic NAS, so I can't be sure there are no hidden gotchas.

I am not sure how trustworthy Newegg reviews are, but somehow I have some
doubts about them now.
I guess it does not matter that much, as long as no more than a disk a month
is failing? The 3-year warranty gives some hope.

Are there any cost-efficient HDDs that someone can suggest? (Most likely 3TB
drives; that seems to be the sweet spot at the moment.)

> Sorry, nothing in detail, but I did actually build a Ceph cluster on the same
> 8-core CPU as you have listed. I didn't have any performance problems, but I
> do remember that with SSD journals, high-queue-depth writes could push the
> CPU quite high. It's like what I said before about the 1Gb vs 10Gb
> networking: how important is performance? If using this CPU gives you an
> extra 1ms of latency per OSD, is that acceptable?
>
> Agreed, 12 cores (guessing 2.5GHz each) will be overkill for just 12 OSDs. I
> have a very similar spec and see exactly the same as you, but I will change
> the nodes to 1 CPU each when I expand and use the spare CPUs for the new
> nodes.
>
> I'm using this:-
>
> http://www.supermicro.nl/products/system/4U/F617/SYS-F617H6-FTPTL_.cfm
>
> Mainly because of rack density, which I know doesn't apply to you. But the
> fact that they share PSUs/rails/chassis helps reduce power a bit and drives
> down cost.
>
> I can get 14 disks in each and they have 10Gb on board. The SAS controller
> can be flashed to JBOD mode.
>
> Maybe one of the other Twin solutions might be suitable?

I did consider that exact model (it was mentioned on the list some time ago).
I could get about the same effective storage capacity with it, but
10G networking is just too expensive on the switch side.

Also, those nodes and 10G switches consume a lot more power.

By my estimates and the numbers I found, the Avoton nodes should run at about
55W each. The switches (EX3300), according to the tech specs, would need at
most 76W each.
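
For what it's worth, a quick total-power sketch (the node and switch counts
below are only hypothetical placeholders, since the final layout isn't fixed;
the per-device figures are the estimates above):

# Quick power-budget estimate. Node/switch counts are hypothetical
# placeholders; the per-device wattages are the estimates quoted above.

NODE_W = 55      # estimated draw per Avoton node
SWITCH_W = 76    # EX3300 maximum draw per the spec sheet

def cluster_watts(nodes: int, switches: int) -> int:
    return nodes * NODE_W + switches * SWITCH_W

nodes, switches = 6, 2    # hypothetical example counts
print(f"{nodes} nodes + {switches} switches ~ {cluster_watts(nodes, switches)} W")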

___
Dominik
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
