Re: Typical 10GbE latency

>> Is this with an 8192-byte payload?
Oh, sorry, it was with 1500 bytes.
I'll try to send a report with 8192 bytes tomorrow.

----- Original Message -----

From: "Robert LeBlanc" <robert@xxxxxxxxxxxxx>
To: "Alexandre DERUMIER" <aderumier@xxxxxxxxx>
Cc: "Wido den Hollander" <wido@xxxxxxxx>, ceph-users@xxxxxxxxxxxxxx
Sent: Tuesday, 11 November 2014 23:13:17
Subject: Re: Typical 10GbE latency


Is this with an 8192-byte payload? The theoretical transfer time at 1 Gbps (you are only sending one packet, so LACP won't help) is 0.061 ms in one direction; double that and you are at 0.122 ms of bits in flight. Then there is context switching, switch latency (store-and-forward assumed for 1 Gbps), etc., which I'm not sure would fit in the remaining 0.057 ms of your min time. If it is an 8192-byte payload, then I'm really impressed!
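For reference, that wire-time arithmetic can be sketched in one line (a rough estimate, assuming about 46 bytes of ICMP/IP/Ethernet framing overhead; the exact figure shifts a little depending on which headers, preamble and inter-frame gap you count):

$ awk 'BEGIN { printf "%.3f ms one way at 1 Gbps\n", (8192 + 46) * 8 / 1e9 * 1000 }'
0.066 ms one way at 1 Gbps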


On Tue, Nov 11, 2014 at 11:56 AM, Alexandre DERUMIER <aderumier@xxxxxxxxx> wrote:


I don't have 10GbE yet, but here is my result with simple LACP over 2 gigabit links and a Cisco 6500:

rtt min/avg/max/mdev = 0.179/0.202/0.221/0.019 ms 


(Seems to be lower than your 10GbE Nexus.)


----- Original Message -----

From: "Wido den Hollander" <wido@xxxxxxxx>
To: ceph-users@xxxxxxxxxxxxxx
Sent: Monday, 10 November 2014 17:22:04
Subject: Re: Typical 10GbE latency



On 08-11-14 02:42, Gary M wrote: 
> Wido, 
> 
> Take the switch out of the path between the nodes and remeasure. ICMP echo
> requests are very low-priority traffic for switches and network stacks.
> 

I tried with a direct TwinAx and fiber cable. No difference. 

> If you really want to know, place a network analyzer between the nodes
> to measure the request-packet-to-response-packet latency. The ICMP
> timing reported by the ping application is not accurate in the
> sub-millisecond range and should only be used as a rough estimate.
> 

True, I fully agree with you. But why is everybody showing a lower
latency here? My latencies are about 40% higher than what I see in this
setup and other setups.
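
One way to take the ping application out of the measurement would be to
capture on both hosts and compare the kernel timestamps of the echo
request and echo reply directly; a rough sketch, assuming the interface
is named eth0:

$ tcpdump -i eth0 -n icmp

The capture timestamps are taken in the kernel, so they should be less
noisy than ping's user-space reporting, though still not a
hardware-level measurement.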

> You may also want to install the high-resolution timer patch, sometimes
> called HRT, into the kernel, which may give you different results.
>
> ICMP traffic takes a different path than the TCP traffic and should not
> be considered an indicator of a defect.
> 

Yes, I'm aware. But it still doesn't explain why the latency on other
systems, which are in production, is lower than on this idle system.

> I believe the ping app calls the sendto system call (sorry, it's been a
> while since I last looked). System calls can take between 0.1 us and
> 0.2 us each. However, the ping application makes several of these calls
> and then waits for a signal from the kernel. Waiting for a signal means
> the ping application must wait to be rescheduled before it can report
> the time, and rescheduling depends on a lot of other factors in the OS,
> e.g. timers, card interrupts, and other tasks with higher priorities.
> Reporting the time adds a few more system calls, and as the ping
> application loops to post the next ping request, it again makes a few
> system calls, each of which may cause a task switch.
>
> For the above factors, the ping application is not a good representation
> of network performance, due to overhead in the application itself and
> the traffic shaping performed at the switch and in the TCP stacks.
> 

I think that netperf is probably a better tool, but that also measures
TCP latencies.

I want the real IP latency, so I assumed that ICMP would be the
simplest option.
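
For what it's worth, a netperf request/response test would look
something like this (TCP_RR measures round trips over TCP; UDP_RR skips
the TCP stack and is closer to a raw IP round trip; both assume
netserver is running on the remote host):

$ netperf -H <ip> -t TCP_RR
$ netperf -H <ip> -t UDP_RR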

The other setups I have access to are in production and do not have any
special tuning, yet their latency is still lower than on this new
deployment.

That's what has me confused.

Wido 

> cheers, 
> gary 
> 
> 
> On Fri, Nov 7, 2014 at 4:32 PM, Łukasz Jagiełło
> <jagiello.lukasz@xxxxxxxxx> wrote:
> 
> Hi, 
> 
> rtt min/avg/max/mdev = 0.070/0.177/0.272/0.049 ms 
> 
> 04:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit 
> SFI/SFP+ Network Connection (rev 01) 
> 
> on both hosts, with an Arista 7050S-64 switch in between.
> 
> Both hosts were part of an active Ceph cluster.
> 
> 
> On Thu, Nov 6, 2014 at 5:18 AM, Wido den Hollander
> <wido@xxxxxxxx> wrote:
> 
> Hello, 
> 
> While working at a customer's site, I've run into a 10GbE latency
> that seems high to me.
> 
> I have access to a couple of Ceph clusters and I ran a simple
> ping test:
> 
> $ ping -s 8192 -c 100 -n <ip> 
> 
> Two results I got: 
> 
> rtt min/avg/max/mdev = 0.080/0.131/0.235/0.039 ms 
> rtt min/avg/max/mdev = 0.128/0.168/0.226/0.023 ms 
> 
> Both these environments are running with Intel 82599ES 10Gbit
> cards in LACP, one with Extreme Networks switches and the other
> with Arista.
> 
> Now, on an environment with Cisco Nexus 3000 and Nexus 7000
> switches, I'm seeing:
> 
> rtt min/avg/max/mdev = 0.160/0.244/0.298/0.029 ms 
> 
> As you can see, the Cisco Nexus network has higher latency
> compared to the other setups.
> 
> You would say the switches are to blame, but we also tried a
> direct TwinAx connection, and that didn't help.
> 
> This setup also uses the Intel 82599ES cards, so the cards don't 
> seem to 
> be the problem. 
> 
> The MTU is set to 9000 on all these networks and cards. 
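>
> For reference, something along these lines (assuming the interface is
> named eth0; yours may differ) confirms the MTU actually in effect:
>
> $ ip link show eth0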
> 
> I was wondering: could others with a Ceph cluster running on 10GbE
> perform a simple network latency test like this? I'd like to
> compare the results.
> 
> -- 
> Wido den Hollander 
> 42on B.V. 
> Ceph trainer and consultant 
> 
> Phone: +31 (0)20 700 9902
> Skype: contact42on 
> 
> -- 
> Łukasz Jagiełło 
> lukasz<at>jagiello<dot>org 
> 


-- 
Wido den Hollander 
42on B.V. 

Phone: +31 (0)20 700 9902 
Skype: contact42on 
_______________________________________________ 
ceph-users mailing list 
ceph-users@xxxxxxxxxxxxxx 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 