Re: What is the should be the expected latency of 10Gbit network connections

On 2018-01-23 08:27, Blair Bethwaite wrote:

Firstly, the OP's premise in asking, "Or should there be a difference
of 10x", is fundamentally incorrect. Greater bandwidth does not mean
lower latency, though the latter almost always results in the former.
Unfortunately, changing the speed of light remains a difficult
engineering challenge :-). However, you can add multiple links,
overlap signals on the wire, and tweak error-correction encodings,
all to get more bits on the wire without making the wire itself any
faster. Take Mellanox 100Gb Ethernet: one lane is 25Gb; to get 50Gb
they mash 2 lanes together, and to get 100Gb they mash 4 lanes. The
latency of a single bit transmission is more or less unchanged. Also
note that with UDP/TCP pings or actual Ceph traffic we're going via
the kernel stack running on the CPU, and as such the speed and
power management of the CPU can make quite a difference.
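
To put rough numbers on the bandwidth-versus-latency point, here is a minimal sketch of the time needed just to clock one frame onto the wire at several nominal line rates. It is plain arithmetic (bits divided by line rate). The function name and the frame sizes are illustrative choices, and the calculation ignores preamble, inter-frame gap, encoding overhead, switch forwarding and the host stack.

```python
def serialization_delay_us(frame_bytes: int, link_gbps: float) -> float:
    """Microseconds needed to clock frame_bytes onto a link of link_gbps."""
    return frame_bytes * 8 / (link_gbps * 1e9) * 1e6


if __name__ == "__main__":
    # Illustrative frame sizes: minimum Ethernet frame, standard MTU, jumbo MTU.
    for size in (64, 1500, 9000):
        row = ", ".join(
            f"{gbps:>3}G: {serialization_delay_us(size, gbps):8.3f} us"
            for gbps in (1, 10, 25, 100)
        )
        print(f"{size:>5} B  {row}")
```

For a 64-byte frame the wire time is about 0.5 us at 1Gbit and about 0.05 us at 10Gbit, so for small messages the difference between link speeds is small next to a round trip measured in tens of microseconds.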

Example: 25GE on a dual-port CX-4 card in an LACP bond, RHEL7 host.

$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.3 (Maipo)
$ ofed_info | head -1
MLNX_OFED_LINUX-4.0-1.0.1.0 (OFED-4.0-1.0.1):
$ grep 'model name' /proc/cpuinfo | uniq
model name      : Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
$ ibv_devinfo
hca_id: mlx5_1
        transport:                      InfiniBand (0)
        fw_ver:                         14.18.1000
        node_guid:                      ...
        sys_image_guid:                 ...
        vendor_id:                      0x02c9
        vendor_part_id:                 4117
        hw_ver:                         0x0
        board_id:                       MT_2420110034
...


$ sudo ping -M do -s 8972 -c 100000 -f ...
100000 packets transmitted, 100000 received, 0% packet loss, time 4652ms
rtt min/avg/max/mdev = 0.029/0.031/2.711/0.015 ms, ipg/ewma 0.046/0.031 ms

$ sudo ping -M do -s 3972 -c 100000 -f ...
100000 packets transmitted, 100000 received, 0% packet loss, time 3321ms
rtt min/avg/max/mdev = 0.019/0.022/0.364/0.003 ms, ipg/ewma 0.033/0.022 ms

$ sudo ping -M do -s 1972 -c 100000 -f ...
100000 packets transmitted, 100000 received, 0% packet loss, time 2818ms
rtt min/avg/max/mdev = 0.017/0.018/0.086/0.005 ms, ipg/ewma 0.028/0.021 ms

$ sudo ping -M do -s 472 -c 100000 -f ...
100000 packets transmitted, 100000 received, 0% packet loss, time 2498ms
rtt min/avg/max/mdev = 0.014/0.016/0.305/0.005 ms, ipg/ewma 0.024/0.017 ms

$ sudo ping -M do -c 100000 -f ...
100000 packets transmitted, 100000 received, 0% packet loss, time 2363ms
rtt min/avg/max/mdev = 0.014/0.015/0.322/0.006 ms, ipg/ewma 0.023/0.016 ms
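
For comparing runs like the ones above, a small helper can pull the summary figures out of ping's final line. This is a sketch that assumes the iputils output format shown in this message; the function name is made up for illustration. The payload sizes and summary lines are copied from the runs above, with 56 bytes being ping's default payload when -s is not given.

```python
import re

# Matches the summary line printed by iputils ping, as seen in the runs above.
RTT_RE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_rtt_us(line: str) -> dict:
    """Return min/avg/max/mdev from a ping summary line, in microseconds."""
    m = RTT_RE.search(line)
    if m is None:
        raise ValueError("not a ping rtt summary line")
    values = [float(x) * 1000.0 for x in m.groups()]
    return dict(zip(("min", "avg", "max", "mdev"), values))


# Payload size in bytes -> summary line, copied from the runs above.
RUNS = {
    8972: "rtt min/avg/max/mdev = 0.029/0.031/2.711/0.015 ms, ipg/ewma 0.046/0.031 ms",
    3972: "rtt min/avg/max/mdev = 0.019/0.022/0.364/0.003 ms, ipg/ewma 0.033/0.022 ms",
    1972: "rtt min/avg/max/mdev = 0.017/0.018/0.086/0.005 ms, ipg/ewma 0.028/0.021 ms",
    472: "rtt min/avg/max/mdev = 0.014/0.016/0.305/0.005 ms, ipg/ewma 0.024/0.017 ms",
    56: "rtt min/avg/max/mdev = 0.014/0.015/0.322/0.006 ms, ipg/ewma 0.023/0.016 ms",
}

if __name__ == "__main__":
    base = parse_rtt_us(RUNS[56])["avg"]
    for size in sorted(RUNS):
        avg = parse_rtt_us(RUNS[size])["avg"]
        print(f"{size:>5} B payload: avg {avg:5.1f} us (+{avg - base:4.1f} us vs default)")
```

Going from the default payload to 8972 bytes adds roughly 16 us to the average round trip, on top of a floor of about 15 us that is there regardless of size.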

On 22 January 2018 at 22:37, Nick Fisk <nick@xxxxxxxxxx> wrote:
Anyone with 25G ethernet willing to do the test? Would love to see what the
latency figures are for that.



From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
Maged Mokhtar
Sent: 22 January 2018 11:28
To: ceph-users@xxxxxxxxxxxxxx
Subject: Re: What is the should be the expected latency of
10Gbit network connections



On 2018-01-22 08:39, Wido den Hollander wrote:



On 01/20/2018 02:02 PM, Marc Roos wrote:

  If I test my connections with sockperf via a 1Gbit switch I get around
25usec; when I test the 10Gbit connection via the switch I get around
12usec. Is that normal? Or should there be a difference of 10x?


No, that's normal.

Tests I did with 8k ping packets over different links:

1GbE:  0.800ms
10GbE: 0.200ms
40GbE: 0.150ms

Wido
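
One way to read these three numbers is as a fixed cost plus a part that shrinks with bandwidth: latency = a + b / bandwidth. The sketch below solves for a and b from the 1GbE and 10GbE figures and checks the result against the 40GbE one. The model and the names are an illustration only, not something measured separately, and three data points cannot say much about where the fixed cost comes from.

```python
# Link speed in Gbit/s -> measured 8k ping latency in ms (figures from above).
MEASURED_MS = {1: 0.800, 10: 0.200, 40: 0.150}


def fit_two_points(bw1: float, lat1: float, bw2: float, lat2: float):
    """Solve latency = a + b / bandwidth through two measurements."""
    b = (lat1 - lat2) / (1.0 / bw1 - 1.0 / bw2)
    a = lat1 - b / bw1
    return a, b


def predict_ms(a: float, b: float, bw: float) -> float:
    """Latency predicted by the fitted model at bandwidth bw (Gbit/s)."""
    return a + b / bw


if __name__ == "__main__":
    a, b = fit_two_points(1, MEASURED_MS[1], 10, MEASURED_MS[10])
    print(f"fixed part a = {a:.3f} ms, bandwidth part b = {b:.3f} ms*Gbit/s")
    print(f"predicted 40GbE: {predict_ms(a, b, 40):.3f} ms, "
          f"measured: {MEASURED_MS[40]:.3f} ms")
```

With these inputs the fixed part comes out at about 0.133 ms and the predicted 40GbE figure is 0.150 ms, which matches the measurement. Most of what remains at 10GbE and above is therefore not bound by bandwidth.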


sockperf ping-pong

sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=10.100 sec; SentMessages=432875; ReceivedMessages=432874
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=10.000 sec; SentMessages=428640; ReceivedMessages=428640
sockperf: ====> avg-lat= 11.609 (std-dev=1.684)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 11.609 usec
sockperf: Total 428640 observations; each percentile contains 4286.40 observations
sockperf: ---> <MAX> observation =  856.944
sockperf: ---> percentile  99.99 =   39.789
sockperf: ---> percentile  99.90 =   20.550
sockperf: ---> percentile  99.50 =   17.094
sockperf: ---> percentile  99.00 =   15.578
sockperf: ---> percentile  95.00 =   12.838
sockperf: ---> percentile  90.00 =   12.299
sockperf: ---> percentile  75.00 =   11.844
sockperf: ---> percentile  50.00 =   11.409
sockperf: ---> percentile  25.00 =   11.124
sockperf: ---> <MIN> observation =    8.888

sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=1.100 sec; SentMessages=22065; ReceivedMessages=22064
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=1.000 sec; SentMessages=20056; ReceivedMessages=20056
sockperf: ====> avg-lat= 24.861 (std-dev=1.774)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 24.861 usec
sockperf: Total 20056 observations; each percentile contains 200.56 observations
sockperf: ---> <MAX> observation =   77.158
sockperf: ---> percentile  99.99 =   54.285
sockperf: ---> percentile  99.90 =   37.864
sockperf: ---> percentile  99.50 =   34.406
sockperf: ---> percentile  99.00 =   33.337
sockperf: ---> percentile  95.00 =   27.497
sockperf: ---> percentile  90.00 =   26.072
sockperf: ---> percentile  75.00 =   24.618
sockperf: ---> percentile  50.00 =   24.443
sockperf: ---> percentile  25.00 =   24.361
sockperf: ---> <MIN> observation =   16.746
[root@c01 sbin]# sockperf ping-pong -i 192.168.0.12 -p 5001 -t 10
sockperf: == version #2.6 ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on
socket(s)
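
When comparing these figures with ping, note that the message counts suggest sockperf's reported latency is about half of a round trip: dividing the valid run time by the number of messages gives roughly twice the reported average. The quick check below uses the numbers from the two runs above. The helper name is illustrative, and the calculation assumes ping-pong mode keeps exactly one message in flight at a time.

```python
# (valid run time in s, messages, reported avg latency in usec), from the output above.
RUNS = {
    "first run": (10.000, 428640, 11.609),
    "second run": (1.000, 20056, 24.861),
}


def implied_round_trip_us(run_time_s: float, messages: int) -> float:
    """Average time per ping-pong exchange if exchanges run back to back."""
    return run_time_s * 1e6 / messages


if __name__ == "__main__":
    for name, (secs, msgs, reported) in RUNS.items():
        rtt = implied_round_trip_us(secs, msgs)
        print(f"{name}: {rtt:.2f} us per ping-pong, "
              f"{rtt / reported:.2f}x the reported latency")
```

Both runs come out at very nearly 2x, so the 11.6 usec and 24.9 usec figures are best compared against half of a ping RTT rather than the full RTT.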








_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


I find the ping command with the flood option handy for measuring latency; it
gives min/max/average/std deviation stats.

example:

ping  -c 100000 -f 10.0.1.12

Maged





The ping flood test will show hardware link-level latency. sockperf will show the latency that user-space TCP socket applications will see due to kernel context switches, interrupts, transmission buffers, TCP ACKs, etc. So:
sockperf is a better measure of the latency Ceph clients will see.
The flood latency gives a better picture of expected IOPS, which is the inverse of latency at the link level (at the app level, with concurrency, IOPS is not related to latency).
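
The relation between latency and IOPS can be written down directly. With one request in flight, the rate is simply 1 / latency; with several in flight, the rate also depends on the concurrency (Little's law), which is why latency alone stops predicting it. A minimal sketch, with made-up function names and assuming latency does not rise as concurrency increases:

```python
def serial_ops_per_sec(round_trip_us: float) -> float:
    """Operations per second with exactly one request in flight."""
    return 1e6 / round_trip_us


def concurrent_ops_per_sec(round_trip_us: float, in_flight: int) -> float:
    """Upper bound on operations per second with in_flight requests outstanding.

    Assumes per-request latency stays constant under load, which real
    systems only approximate.
    """
    return in_flight * 1e6 / round_trip_us


if __name__ == "__main__":
    # 15 us and 31 us are the average flood-ping RTTs quoted earlier in the thread.
    for rtt_us in (15.0, 31.0):
        print(f"RTT {rtt_us:4.1f} us: "
              f"{serial_ops_per_sec(rtt_us):8.0f} ops/s serial, "
              f"{concurrent_ops_per_sec(rtt_us, 16):9.0f} ops/s with 16 in flight")
```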

Maybe with SPDK/RDMA, Ceph latency will be close to link latency.

