Re: Switches and latency

> -----Original Message-----
> From: Gandalf Corvotempesta [mailto:gandalf.corvotempesta@xxxxxxxxx]
> Sent: 15 June 2016 21:33
> To: nick@xxxxxxxxxx
> Cc: ceph-users@xxxxxxxx
> Subject: Re:  Switches and latency
> 
> 2016-06-15 22:13 GMT+02:00 Nick Fisk <nick@xxxxxxxxxx>:
> > I would reconsider if you need separate switches for each network,
> > vlans would normally be sufficient. If bandwidth is not an issue, you
> > could even tag both vlans over the same uplinks. Then there is the
> > discussion around whether separate networks are really essential....
> 
> Are you suggesting using the same switch port for both the public and
> private networks via VLANs? That will slow everything down, as the same
> port is used for both replication and public access.

Possibly, but by how much? 20Gb/s of bandwidth is a lot to feed 12x 7.2k disks, particularly if they start doing any sort of non-sequential IO.
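For reference, a minimal sketch of what the shared-uplink/VLAN approach looks like on a Linux host (interface names, VLAN IDs and subnets below are placeholders, not taken from this thread):

    # tag both networks over the same physical uplink (eth0 assumed)
    ip link add link eth0 name eth0.100 type vlan id 100    # public
    ip link add link eth0 name eth0.200 type vlan id 200    # cluster
    ip addr add 192.168.100.11/24 dev eth0.100
    ip addr add 192.168.200.11/24 dev eth0.200
    ip link set eth0.100 up
    ip link set eth0.200 up

    # ceph.conf then just points each network at its VLAN subnet:
    # [global]
    # public network  = 192.168.100.0/24
    # cluster network = 192.168.200.0/24

As a back-of-the-envelope check: 2x 10Gb/s is roughly 2.4GB/s usable, while 12x 7.2k disks manage at best around 12x ~150MB/s = ~1.8GB/s sequential, and far less once the IO goes random.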

> 
> What I can do is buy two 24-port switches and use, for the moment,
> ports 1 to 12 for public (vlan100) and ports 13 to 24 for private (vlan200).
> 
> When I have to grow the cluster beyond 12 OSD servers or more than
> 12 "frontend" servers, I'll buy two more switches and move all the cabling
> to the new ones.
> 
> Even better, to keep costs low: two 12-port switches, 6 ports used for the
> frontend, 6 used for the cluster network.
> That would let me run 6 OSD servers (288TB raw, using 12x 4TB disks per
> server) and 6 hypervisor servers accessing the cluster.
> When I have to grow, I'll replace the two switches with bigger ones.
> 
> (side question: which switch should I change, the cluster one or the public
> one? Changing the cluster one would trigger healing during the recabling,
> as ceph will lose one OSD server for a couple of seconds, right?
> Would changing the frontend one trigger a VM migration, as the whole node
> loses storage access, or just a temporary I/O freeze?)

I think you want to keep it as simple as possible and make the right decision first time round. Buy a TOR switch that will accommodate the number of servers you wish to put in your rack and you should never need to change it.

I think there are issues when one of the networks is down and not the other, so stick to terminating all of each server's connections into the same switch, otherwise you are just inviting trouble.
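On the side question: if a switch swap ever does force a recable, the usual trick is to stop Ceph from marking OSDs out and rebalancing while they are briefly unreachable. Roughly (these are the standard Ceph flags, the rest is just the sequence I would follow):

    ceph osd set noout          # don't mark out / rebalance OSDs that drop briefly
    ceph osd set norebalance    # optionally freeze rebalancing as well
    # ... move the cables, wait for the OSDs to report back up (ceph -s) ...
    ceph osd unset norebalance
    ceph osd unset noout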

> 
> > Very true, and it's one of the reasons we switched to it; the other
> > being I was fed up having to solve the "this cable doesn't work with
> > this switch/NIC" challenge. Why cables need EEPROM to say which
> > devices they will work with is lost on me!!!
> 
> Twinax cables aren't standard and might not work with my switches?
> If so, 10GBase-T for the rest of my life!

Yeah, and it's worse if you want to connect two different manufacturers' kit, as you sometimes even need a bespoke cable with the right vendor coded on either end. Some vendors are better than others, but I just got fed up with it and like the fact that with 10GBase-T it just works.

> > On the latency front I wouldn't be too concerned. 10GBase-T has about 2us
> > more latency per hop than SFP+. The lowest latencies commonly seen in Ceph
> > are around 500-1000us for reads and 2000us for writes. So unless you
> > are chasing every last fraction of a percent, I don't think you will
> > notice. It might be wise to link the switches together with SFP+ or 40G
> > though, so the higher latency only affects the last hop to the host,
> > and it will put you in a better place if/when you need to scale your
> > network out.
> 
> My network is very flat. I'll have 2 hops maximum:
> 
> OSD server -> cluster switch (TOR) -> spine switch -> cluster switch
> (TOR) -> OSD server
> That's only the case with multiple racks. In a single rack I'll have just
> 1 hop between the OSD server and the top-of-rack cluster switch.
> 
> I can aggregate links between TOR and spine using 4x 10GBaseT ports. I
> don't have any 10/40GB switches, and they would be too expensive.
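
For completeness, the switch side of that aggregation is vendor-specific CLI, but if any of those links end up terminating on a Linux host, the equivalent is just an 802.3ad (LACP) bond, something like this (interface names are placeholders):

    ip link add bond0 type bond mode 802.3ad
    ip link set eth0 down; ip link set eth0 master bond0
    ip link set eth1 down; ip link set eth1 master bond0
    ip link set bond0 up
    # the VLAN sub-interfaces from the earlier sketch would then sit on bond0
    # (bond0.100 / bond0.200 instead of eth0.100 / eth0.200)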

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


