Re: Typical 10GbE latency

On 08-11-14 02:42, Gary M wrote:
> Wido,
> 
> Take the switch out of the path between the nodes and re-measure. ICMP echo
> requests are very low-priority traffic for switches and network stacks.
> 

I tried with a direct TwinAx and fiber cable. No difference.

> If you really want to know, place a network analyzer between the nodes
> to measure the latency from request packet to response packet. The ICMP
> timing reported by the "ping application" is not accurate in the
> sub-millisecond range and should only be used as a rough estimate.
> 

True, I fully agree with you. But why is everybody showing a lower
latency here? My latencies are about 40% higher than what I see in this
setup and in other setups.

> You may also want to install the high-resolution timer patch, sometimes
> called HRT, in the kernel, which may give you different results.
> 
> ICMP traffic takes a different path than the TCP traffic and should not
> be considered an indicator of a defect.
> 

Yes, I'm aware. But it still doesn't explain why the latency on other
systems, which are in production, is lower than on this idle system.
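
(As for the high-resolution timers: on recent kernels they are usually
already enabled. A quick way to check, assuming a distro that ships the
kernel config under /boot, is something like:

$ grep CONFIG_HIGH_RES_TIMERS= /boot/config-$(uname -r)

which should print CONFIG_HIGH_RES_TIMERS=y on a kernel built with them.)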

> I believe the ping app calls the sendto system call (sorry, it's been a
> while since I last looked). System calls can take between 0.1 us and
> 0.2 us each. However, the ping application makes several of these calls
> and waits for a signal from the kernel. Waiting for a signal means the
> ping application must wait to be rescheduled before it can report the
> time. Rescheduling depends on a lot of other factors in the OS, e.g.
> timers, card interrupts and other tasks with higher priorities.
> Reporting the time adds a few more system calls. The ping application
> then loops to post the next ping request, which again requires a few
> system calls, each of which may cause a task switch.
> 
> For the above reasons, the ping application is not a good representation
> of network performance, due both to factors in the application itself
> and to traffic shaping performed at the switch and in the TCP stacks.
> 
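
(Side note: those per-call costs can actually be made visible by running
ping under strace; the -T flag prints the time spent in each system call.
A rough illustration, nothing more:

$ strace -T -e trace=network ping -c 3 -n <ip>

That only shows the application-side overhead, of course, not the latency
on the wire.)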

I think that netperf is probably a better tool, but that also measures TCP
latencies.

I want the real IP latency, so I assumed that ICMP would be the simplest
way to measure it.
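
(If I do go the netperf route, the request/response tests are probably the
closest thing to a round-trip latency figure. Roughly, assuming netserver
is already running on the remote host:

$ netperf -H <ip> -t TCP_RR
$ netperf -H <ip> -t UDP_RR

These report a transaction rate; the mean round-trip latency is roughly
the inverse of that rate.)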

The other setups I have access to are in production and do not have any
special tuning, yet their latency is still lower than on this new
deployment.

That's what gets me confused.

Wido

> cheers,
> gary
> 
> 
> On Fri, Nov 7, 2014 at 4:32 PM, Łukasz Jagiełło
> <jagiello.lukasz@xxxxxxxxx> wrote:
> 
>     Hi,
> 
>     rtt min/avg/max/mdev = 0.070/0.177/0.272/0.049 ms
> 
>     04:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit
>     SFI/SFP+ Network Connection (rev 01)
> 
>     at both hosts and Arista 7050S-64 between.
> 
>     Both hosts were part of an active Ceph cluster.
> 
> 
>     On Thu, Nov 6, 2014 at 5:18 AM, Wido den Hollander
>     <wido@xxxxxxxx> wrote:
> 
>         Hello,
> 
>         While working at a customer I've run into 10GbE latency which
>         seems high to me.
> 
>         I have access to a couple of Ceph clusters and I ran a simple
>         ping test:
> 
>         $ ping -s 8192 -c 100 -n <ip>
> 
>         Two results I got:
> 
>         rtt min/avg/max/mdev = 0.080/0.131/0.235/0.039 ms
>         rtt min/avg/max/mdev = 0.128/0.168/0.226/0.023 ms
> 
>         Both of these environments are running Intel 82599ES 10Gbit
>         cards in LACP, one with Extreme Networks switches, the other
>         with Arista.
> 
>         Now, on an environment with Cisco Nexus 3000 and Nexus 7000
>         switches I'm seeing:
> 
>         rtt min/avg/max/mdev = 0.160/0.244/0.298/0.029 ms
> 
>         As you can see, the Cisco Nexus network has higher latency
>         compared to the other setups.
> 
>         You would say the switches are to blame, but we also tried a
>         direct TwinAx connection, and that didn't help.
> 
>         This setup also uses the Intel 82599ES cards, so the cards don't
>         seem to
>         be the problem.
> 
>         The MTU is set to 9000 on all these networks and cards.
> 
>         I was wondering: could others with a Ceph cluster running on
>         10GbE perform a simple network latency test like this? I'd like
>         to compare the results.
> 
>         --
>         Wido den Hollander
>         42on B.V.
>         Ceph trainer and consultant
> 
>         Phone: +31 (0)20 700 9902
>         Skype: contact42on
>         _______________________________________________
>         ceph-users mailing list
>         ceph-users@xxxxxxxxxxxxxx
>         http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> 
>     -- 
>     Łukasz Jagiełło
>     lukasz<at>jagiello<dot>org
> 
>     _______________________________________________
>     ceph-users mailing list
>     ceph-users@xxxxxxxxxxxxxx
>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


-- 
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




