Re: Cost- and Power-efficient OSD-Nodes

On Tuesday, April 28, 2015, Nick Fisk <nick@xxxxxxxxxx> wrote:




> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> Dominik Hannen
> Sent: 28 April 2015 15:30
> To: Jake Young
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Cost- and Power-efficient OSD-Nodes
>
> >> Interconnect as currently planned:
> >> 4 x 1Gbit LACP Bonds over a pair of MLAG-capable switches (planned:
> >> EX3300)
>
> > One problem with LACP is that it will only allow you to have 1Gbps
> > between any two IPs or MACs (depending on your switch config). This
> > will most likely limit the throughput of any client to 1Gbps, which is
> > equivalent to 125MBps storage throughput.  It is not really equivalent
> > to a 4Gbps interface or 2x 2Gbps interfaces (if you plan to have a
> > client network and cluster network).
>
> 2 x (2 x 1Gbit) was on my mind, with cluster/public separated, in case
> the performance of 4 x 1Gbit LACP does not deliver.
> Regarding source-IP/dest-IP hashing with LACP: wouldn't it be sufficient
> to give each osd-process its own IP for cluster/public then?

I'm not sure this is supported. It would probably require a custom CRUSH
map. I don't know if a host bucket can support multiple IPs. It's a good
idea though; I wish I had thought of it last year!
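
For what it's worth, ceph.conf does accept "public addr" / "cluster addr" in
per-daemon sections, so binding each OSD to its own pair of IPs might look
something like the sketch below (addresses purely illustrative); whether the
LACP hashing then actually spreads the flows as hoped is the part that would
need testing.

  [osd.0]
      public addr  = 192.168.10.101
      cluster addr = 192.168.20.101
  [osd.1]
      public addr  = 192.168.10.102
      cluster addr = 192.168.20.102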

>
> I am not sure whether 4-link LACP will be problematic with enough systems
> in the cluster. Maybe 8 osd-nodes will not be enough to balance it out.
> It is not important whether every client is able to get peak performance
> out of it.
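
On the host side, the bond's transmit hash policy matters as much as the
switch's hashing: with layer3+4 hashing, different IP/port pairs can land on
different slaves, while plain layer2 hashing pins all traffic between two
MACs to a single link. A minimal sketch, assuming Debian-style ifupdown with
ifenslave (interface names and address purely illustrative):

  auto bond0
  iface bond0 inet static
      address 10.0.10.11
      netmask 255.255.255.0
      bond-slaves eth0 eth1 eth2 eth3
      bond-mode 802.3ad
      bond-miimon 100
      bond-lacp-rate fast
      bond-xmit-hash-policy layer3+4

Note this only controls egress hashing from the server; the return path
depends on how the EX3300 pair hashes across the MLAG.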
>
> > I have implemented a small cluster with no SSD journals, and the
> > performance is pretty good.
> >
> > 42 OSDs, 3x replication, 40Gb NICs; rados bench shows me 2000 IOPS at
> > 4k writes and 500MBps at 4M writes.
> >
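For reference, rados bench invocations along these lines produce that kind
of output (pool name, duration and thread count here are illustrative):

  rados -p rbd bench 60 write -b 4096 -t 16       # 4 KB writes
  rados -p rbd bench 60 write -b 4194304 -t 16    # 4 MB writes
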
> > I would trade your SSD journals for 10Gb NICs and switches. I started
> > out with the same 4x 1Gb LACP config, and things like
> > rebalancing/recovery were terribly slow, in addition to the throughput
> > limit I mentioned above.
>
> The SSDs are about ~100 USD apiece. I tried to find cost-efficient 10G
> switches. There is also power efficiency to consider: a 10GBase-T port
> burns about 3-5 W on its own, which would put SFP+ ports (0.7 W/port) on
> the table.

I think the latest switches/NICs reduce this a little further if you enable
the power-saving options and keep the cable lengths short.
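
Back-of-the-envelope, using the per-port figures above and assuming 8 OSD
nodes with 2 x 10G ports each (server side only, switch ports excluded):

  10GBase-T: 16 ports x ~4 W   = ~64 W
  SFP+:      16 ports x ~0.7 W = ~11 W

That is roughly a 50 W difference, or on the order of 440 kWh per year of
continuous operation.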

>
> Can you recommend 'cheap' 10G switches/NICs?

I'm using the Dell N4032s. They seem to do the job and aren't too expensive.
For the server side, we got servers with 10GBase-T built in for almost the
same cost as the 4x 1Gb models.

I'm using a pair of Cisco Nexus 5672UP switches. There are other Nexus 5000 models that are less expensive, but it's pretty affordable for 48 10Gb ports and 6 40Gb uplinks. 

I have Cisco UCS servers that have the Cisco VICs. 


>
> > When you get more funding next quarter/year, you can choose to add the
> > SSD journals or more OSD nodes. Moving to 10Gb networking after you
> > get the cluster up and running will be much harder.
>
> My thinking was that the switches (EX3300), with their 10G uplinks, would
> cover the case where I want to add some 10G switches and hosts later.
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



