Re: Typical 10GbE latency


On 11/10/2014 06:57 PM, Gary M wrote:
> Hi Wido,
> 
> That is a bit weird. I'd also check the Ethernet controller firmware
> version and settings against the other configurations. There must be
> something different.
> 

Indeed, there must be something, but I can't figure out what yet. Same
controllers, same OS, direct cables, and still the latency is 40%
higher.
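
For reference, comparing those settings comes down to something like
this (a sketch; eth2 is just an example interface name):

$ ethtool -i eth2    # driver and firmware version
$ ethtool -c eth2    # interrupt coalescing settings (rx-usecs, etc.)
$ ethtool -k eth2    # offload settings (LRO, GRO, TSO)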

> I can understand wanting to do a simple latency test. But as we get
> closer to hardware speeds and microsecond measurements, the
> measurements become more unstable through software stacks.
> 

I fully agree with you. But a basic ICMP test on an idle machine should
give a baseline from which you can start diagnosing network latency
further with better tools like netperf.
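
For example, a minimal netperf round-trip test (assuming netserver is
running on the remote host; TCP_RR and UDP_RR report transactions per
second, so latency is roughly 1/rate):

$ netperf -H <ip> -t TCP_RR -l 30    # TCP request/response round trips
$ netperf -H <ip> -t UDP_RR -l 30    # UDP_RR is closer to raw IP latency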

Wido

> 
> 
> -gary
> 
> On Mon, Nov 10, 2014 at 9:22 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> 
>> On 08-11-14 02:42, Gary M wrote:
>>> Wido,
>>>
>>> Take the switch out of the path between the nodes and remeasure.
>>> ICMP echo requests are very low-priority traffic for switches and
>>> network stacks.
>>>
>>
>> I tried with a direct TwinAx and fiber cable. No difference.
>>
>>> If you really want to know, place a network analyzer between the nodes
>>> to measure the latency from request packet to response packet. The
>>> timing the "ping application" reports is not accurate in the
>>> sub-millisecond range and should only be used as a rough estimate.
>>>
>>
>> True, I fully agree with you. But why is everybody showing a lower
>> latency here? My latencies are about 40% higher than what I see in
>> this setup and other setups.
>>
>>> You may also want to install the high-resolution timer patch,
>>> sometimes called HRT, to the kernel, which may give you different
>>> results.
>>>
>>> ICMP traffic takes a different path than TCP traffic and should not
>>> be considered an indicator of a defect.
>>>
>>
>> Yes, I'm aware. But it still doesn't explain why the latency on other
>> systems, which are in production, is lower than on this idle system.
>>
>>> I believe the ping app calls the sendto system call (sorry, it's been
>>> a while since I last looked). System calls can take between 0.1us and
>>> 0.2us each. However, the ping application makes several of these
>>> calls and then waits for a signal from the kernel. Waiting for a
>>> signal means the ping application must be rescheduled before it can
>>> report the time, and rescheduling depends on a lot of other factors
>>> in the OS, e.g. timers, card interrupts, and other tasks with higher
>>> priorities. Reporting the time adds a few more system calls, and the
>>> application then loops to post the next ping request, which again
>>> requires a few system calls, each of which may cause a task switch.
>>>
>>> For the above reasons, the ping application is not a good measure of
>>> network performance: the application itself adds overhead, and
>>> traffic shaping is performed at the switch and in the TCP stacks.
>>>
>>
>> I think netperf is probably a better tool, but that again measures
>> TCP latencies.
>>
>> I want the real IP latency, so I assumed ICMP would be the simplest
>> way to measure it.
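>>
>> As a middle ground, a flood ping stays at the ICMP/IP level while
>> averaging out some of the scheduling noise described above (a sketch;
>> flood ping needs root):
>>
>> $ ping -f -q -c 10000 -s 8192 <ip>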
>>
>> The other setups I have access to are in production and do not have any
>> special tuning, yet their latency is still lower than on this new
>> deployment.
>>
>> That's what gets me confused.
>>
>> Wido
>>
>>> cheers,
>>> gary
>>>
>>>
>>> On Fri, Nov 7, 2014 at 4:32 PM, Łukasz Jagiełło
>>> <jagiello.lukasz@xxxxxxxxx> wrote:
>>>
>>>     Hi,
>>>
>>>     rtt min/avg/max/mdev = 0.070/0.177/0.272/0.049 ms
>>>
>>>     04:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit
>>>     SFI/SFP+ Network Connection (rev 01)
>>>
>>>     on both hosts, with an Arista 7050S-64 in between.
>>>
>>>     Both hosts were part of an active Ceph cluster.
>>>
>>>
>>>     On Thu, Nov 6, 2014 at 5:18 AM, Wido den Hollander
>>>     <wido@xxxxxxxx> wrote:
>>>
>>>         Hello,
>>>
>>>         While working at a customer's site I've run into 10GbE
>>>         latency that seems high to me.
>>>
>>>         I have access to a couple of Ceph clusters, so I ran a
>>>         simple ping test:
>>>
>>>         $ ping -s 8192 -c 100 -n <ip>
>>>
>>>         Two results I got:
>>>
>>>         rtt min/avg/max/mdev = 0.080/0.131/0.235/0.039 ms
>>>         rtt min/avg/max/mdev = 0.128/0.168/0.226/0.023 ms
>>>
>>>         Both these environments are running Intel 82599ES 10Gbit
>>>         cards in LACP, one with Extreme Networks switches, the other
>>>         with Arista.
>>>
>>>         Now, on an environment with Cisco Nexus 3000 and Nexus 7000
>>>         switches, I'm seeing:
>>>
>>>         rtt min/avg/max/mdev = 0.160/0.244/0.298/0.029 ms
>>>
>>>         As you can see, the Cisco Nexus network has higher latency
>>>         than the other setups.
>>>
>>>         You would say the switches are to blame, but we also tried a
>>>         direct TwinAx connection, and that didn't help.
>>>
>>>         This setup also uses the Intel 82599ES cards, so the cards
>>>         don't seem to be the problem.
>>>
>>>         The MTU is set to 9000 on all these networks and cards.
>>>
>>>         I was wondering: could others with a Ceph cluster running on
>>>         10GbE perform a simple network latency test like this? I'd
>>>         like to compare the results.
>>>
>>>         --
>>>         Wido den Hollander
>>>         42on B.V.
>>>         Ceph trainer and consultant
>>>
>>>         Phone: +31 (0)20 700 9902
>>>         Skype: contact42on
>>>
>>>
>>>
>>>
>>>     --
>>>     Łukasz Jagiełło
>>>     lukasz<at>jagiello<dot>org
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Wido den Hollander
>> 42on B.V.
>>
>> Phone: +31 (0)20 700 9902
>> Skype: contact42on
>>
> 


-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




