I was informed today that the CEPH environment I’ve been working on is no longer available. Unfortunately this happened before I could try any of your suggestions, Roman.
Thank you for all the attention and advice.

On 2018-12-19 22:01, Marc Roos wrote:

I would be interested in learning about the performance increase it has
compared to 10Gbit. I got the ConnectX-3 Pro but I am not using RDMA
because support is not available by default.
Not too much. The following is a comparison on latest master using the
fio engine, which measures bare ceph messenger performance (no disk IO):

https://github.com/ceph/ceph/pull/24678

Mellanox MT27710 Family [ConnectX-4 Lx] 25gb/s:

  bs      iodepth=8, async+posix             iodepth=8, async+rdma
----  ---------------------------------  ----------------------------------
  4k  IOPS=30.0k BW=121MiB/s   0.257ms   IOPS=47.9k BW=187MiB/s   0.166ms
  8k  IOPS=30.8k BW=240MiB/s   0.259ms   IOPS=46.3k BW=362MiB/s   0.172ms
 16k  IOPS=25.1k BW=392MiB/s   0.318ms   IOPS=45.2k BW=706MiB/s   0.176ms
 32k  IOPS=23.1k BW=722MiB/s   0.345ms   IOPS=37.5k BW=1173MiB/s  0.212ms
 64k  IOPS=18.0k BW=1187MiB/s  0.420ms   IOPS=41.0k BW=2624MiB/s  0.189ms
128k  IOPS=12.1k BW=1518MiB/s  0.657ms   IOPS=20.9k BW=2613MiB/s  0.381ms
256k  IOPS=3530  BW=883MiB/s   2.265ms   IOPS=4624  BW=1156MiB/s  1.729ms
512k  IOPS=2084  BW=1042MiB/s  3.387ms   IOPS=2406  BW=1203MiB/s  3.32ms
  1m  IOPS=1119  BW=1119MiB/s  7.145ms   IOPS=1277  BW=1277MiB/s  6.26ms
  2m  IOPS=551   BW=1101MiB/s  14.51ms   IOPS=631   BW=1263MiB/s  12.66ms
  4m  IOPS=272   BW=1085MiB/s  29.45ms   IOPS=318   BW=1268MiB/s  25.17ms

  bs      iodepth=128, async+posix           iodepth=128, async+rdma
----  ---------------------------------  ----------------------------------
  4k  IOPS=75.9k BW=297MiB/s   1.683ms   IOPS=83.4k BW=326MiB/s   1.535ms
  8k  IOPS=64.3k BW=502MiB/s   1.989ms   IOPS=70.3k BW=549MiB/s   1.819ms
 16k  IOPS=53.9k BW=841MiB/s   2.376ms   IOPS=57.8k BW=903MiB/s   2.214ms
 32k  IOPS=42.2k BW=1318MiB/s  3.034ms   IOPS=59.4k BW=1855MiB/s  2.154ms
 64k  IOPS=30.0k BW=1934MiB/s  4.135ms   IOPS=42.3k BW=2645MiB/s  3.023ms
128k  IOPS=18.1k BW=2268MiB/s  7.052ms   IOPS=21.2k BW=2651MiB/s  6.031ms
256k  IOPS=5186  BW=1294MiB/s  24.71ms   IOPS=5253  BW=1312MiB/s  24.39ms
512k  IOPS=2897  BW=1444MiB/s  44.19ms   IOPS=2944  BW=1469MiB/s  43.48ms
  1m  IOPS=1306  BW=1297MiB/s  97.98ms   IOPS=1421  BW=1415MiB/s  90.27ms
  2m  IOPS=612   BW=1199MiB/s  208.6ms   IOPS=862   BW=1705MiB/s  148.9ms
  4m  IOPS=316   BW=1235MiB/s  409.1ms   IOPS=416   BW=1664MiB/s  307.4ms

1. As you can see, there is no big difference between posix and rdma.

2. Even though a 25gb/s card is used, we barely reach 20gb/s. I also have
   results on 100gb/s QLogic cards, with no difference, because the
   bottleneck is not the network. This is especially visible on loads with
   a higher iodepth: bandwidth does not change significantly, so even if
   you increase the number of requests in flight you hit the limit of how
   fast those requests can be processed.

3. Keep in mind this is only messenger performance, so on real Ceph
   workloads you will get less, because the whole IO stack is involved.

--
Roman
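[For anyone wanting to reproduce a similar sweep, a minimal fio job along
these lines covers the same bs/iodepth matrix as the tables above. This is
only a sketch: the actual bare-messenger engine and its options come from
the PR linked above and are not reproduced here, so the stock null engine
(which performs no IO) stands in purely to show the shape of the run.

    ; sketch only -- replace ioengine=null with the messenger engine
    ; from the PR; option names specific to that engine are not shown
    [global]
    ioengine=null
    rw=write
    size=1g
    time_based
    runtime=30
    ; the second table repeats the sweep with iodepth=128
    iodepth=8

    [bs-4k]
    bs=4k

    [bs-64k]
    bs=64k

    [bs-4m]
    bs=4m

Each [bs-*] section is one point of the block-size sweep; the full tables
simply extend this from 4k up to 4m.]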
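[On Marc's point about RDMA not being available by default: switching the
async messenger from the posix to the RDMA backend is a ceph.conf change,
roughly as sketched below. This assumes a Ceph build with RDMA support;
the device name mlx5_0 is only an example (check `ibv_devinfo` on your
hosts), and the option names should be verified against your release's
documentation before use.

    # sketch only, assuming a build with RDMA support
    [global]
    # switch the async messenger transport from posix to RDMA
    ms_type = async+rdma
    # RNIC to bind; mlx5_0 is an example, see `ibv_devinfo` for yours
    ms_async_rdma_device_name = mlx5_0

RDMA transports typically also need the daemons to pin memory for their
queues, so the usual memlock limit adjustments apply.]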
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com