Ceph performance problems

Hi,

we are currently testing ways to increase Ceph performance, because what we have seen so far is very close to unusable.

For the test cluster we are using 4 nodes with the following hardware:

Dual-port 200GbE Mellanox Ethernet
2x EPYC Rome 7302
16x 32GB 3200MHz ECC
9x 15.36TB Micron 9300 Pro

For production this will be extended to all 8 nodes if it shows promising results.

- Ceph was installed with cephadm.
- MDS and OSD daemons are located on the same nodes.
- Mostly using the stock config (a quick check of that is sketched right after this list)

- Network performance tested with iperf3 seems fine: 26 Gbit/s with -P4 on a single port (details below) and close to 200 Gbit/s with 10 parallel instances and servers.
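
To make sure nothing was changed unintentionally, this is roughly how I verify we are still on stock settings (assuming the usual cephadm shell on one of the nodes; nothing below is specific to our setup):

ceph config dump       # options stored centrally, i.e. anything overriding the defaults
ceph versions          # all daemons running the same release?
ceph osd df tree       # OSD layout and utilisation per node
ceph fs status         # where the MDS daemons for the CephFS are placed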

When testing a mounted CephFS on the working nodes in various configurations, I only get <50 MB/s with the FUSE mount and <270 MB/s with the kernel mount (dd commands and output attached below). In addition, the Ceph dashboard and our Grafana monitoring report packet loss on all relevant interfaces during load, which does not occur during the normal iperf load tests or rsync/scp file transfers.
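
Since dd with oflag=direct is just a single sequential stream, the next step on our side is to repeat the write test with some parallelism via fio, roughly like this (the directory is the same mount point as above, the job parameters are only a first guess):

fio --name=cephfs-write --directory=/mnt/cephfs/backup \
    --rw=write --bs=1M --size=4G --direct=1 \
    --ioengine=libaio --iodepth=16 --numjobs=4 --group_reporting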

Rados bench shows around 2000 MB/s, which is not the maximum performance of the SSDs but fine for us (details below).


Why is the filesystem so slow compared to the individual components?
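
In the meantime we will try to narrow down the reported packet loss by watching the NIC counters and verifying the path MTU while a CephFS write is running, along these lines (<interface> is a placeholder for the relevant 200GbE port):

ip -s link show <interface>                              # RX/TX drops and errors
ethtool -S <interface> | grep -iE 'drop|discard|error'   # NIC level counters
ping -M do -s 8972 ml2ran08s0                            # 8972 + 28 bytes of headers = 9000, i.e. does a jumbo frame path work end to end?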

Cheers
Dominik




Test details:

------------------------------------------------------------------------------------------------------

Some tests done on working nodes:

Ceph mounted with ceph-fuse

root@ml2ran10:/mnt/cephfs/backup# dd if=/dev/zero of=testfile bs=1M count=4096 oflag=direct
4096+0 records in
4096+0 records out
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 88,2933 s, 48,6 MB/s


Ceph mounted with kernel driver:

root@ml2ran06:/mnt/cephfs/backup# dd if=/dev/zero of=testfile bs=1M count=4096 oflag=direct
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 16.0989 s, 267 MB/s


Storage Node

With fuse

root@ml2rsn05:/mnt/ml2r_storage/backup# dd if=/dev/zero of=testfile bs=1M count=4096 oflag=direct
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 53.9977 s, 79.5 MB/s

Kernel mount:

dd if=/dev/zero of=testfile bs=1M count=4096 oflag=direct
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 17.6726 s, 243 MB/s

_______________________________________________________

Iperf3

iperf3 --zerocopy  -n 10240M -P4 -c ml2ran08s0 -p 4701 -i 15 -b 200000000000
Connecting to host ml2ran08s0, port 4701
[  5] local 129.217.31.180 port 43958 connected to 129.217.31.218 port 4701
[  7] local 129.217.31.180 port 43960 connected to 129.217.31.218 port 4701
[  9] local 129.217.31.180 port 43962 connected to 129.217.31.218 port 4701
[ 11] local 129.217.31.180 port 43964 connected to 129.217.31.218 port 4701
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-3.21   sec  2.50 GBytes  6.69 Gbits/sec    0    632 KBytes
[  7]   0.00-3.21   sec  2.50 GBytes  6.70 Gbits/sec    0    522 KBytes
[  9]   0.00-3.21   sec  2.50 GBytes  6.69 Gbits/sec    0    612 KBytes
[ 11]   0.00-3.21   sec  2.50 GBytes  6.69 Gbits/sec    0    430 KBytes
[SUM]   0.00-3.21   sec  10.0 GBytes  26.8 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-3.21   sec  2.50 GBytes  6.69 Gbits/sec 0             sender
[  5]   0.00-3.21   sec  2.50 GBytes  6.67 Gbits/sec                  receiver
[  7]   0.00-3.21   sec  2.50 GBytes  6.70 Gbits/sec 0             sender
[  7]   0.00-3.21   sec  2.50 GBytes  6.67 Gbits/sec                  receiver
[  9]   0.00-3.21   sec  2.50 GBytes  6.69 Gbits/sec 0             sender
[  9]   0.00-3.21   sec  2.49 GBytes  6.67 Gbits/sec                  receiver
[ 11]   0.00-3.21   sec  2.50 GBytes  6.69 Gbits/sec 0             sender
[ 11]   0.00-3.21   sec  2.50 GBytes  6.67 Gbits/sec                  receiver
[SUM]   0.00-3.21   sec  10.0 GBytes  26.8 Gbits/sec 0             sender
[SUM]   0.00-3.21   sec  9.98 GBytes  26.7 Gbits/sec                  receiver



_________________________________________________________________

Rados Bench on storage node:

# rados bench -p testbench 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_ml2rsn05_2829244
  sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s) avg lat(s)
    0       0         0         0         0         0 -           0
    1      16       747       731   2923.84      2924 0.0178757   0.0216262
    2      16      1506      1490   2979.71      3036 0.0308664   0.0213685
    3      16      2267      2251   3000.99      3044 0.0259053   0.0212556
    4      16      3058      3042   3041.62      3164 0.0227621   0.0209792
    5      16      3850      3834    3066.8      3168 0.0130519   0.0208148
    6      16      4625      4609   3072.26      3100 0.151371   0.0207904
    7      16      5381      5365   3065.28      3024 0.0300368   0.0208345
    8      16      6172      6156   3077.57      3164 0.0197728   0.0207714
    9      16      6971      6955   3090.67      3196 0.0142751   0.0206786
   10      14      7772      7758   3102.76      3212 0.0181034    0.020605
Total time run:         10.0179
Total writes made:      7772
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     3103.23
Stddev Bandwidth:       93.3676
Max bandwidth (MB/sec): 3212
Min bandwidth (MB/sec): 2924
Average IOPS:           775
Stddev IOPS:            23.3419
Max IOPS:               803
Min IOPS:               731
Average Latency(s):     0.020598
Stddev Latency(s):      0.00731743
Max latency(s):         0.151371
Min latency(s):         0.00966991



# rados bench -p testbench 10 seq
hints = 1
  sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s) avg lat(s)
    0       0         0         0         0         0 -           0
    1      15       657       642   2567.32      2568 0.011104    0.022631
    2      15      1244      1229    2456.9      2348 0.0115248    0.019485
    3      16      1499      1483   1976.64      1016 0.00722983   0.0177887
    4      15      1922      1907   1906.16      1696 0.0142242   0.0327382
    5      15      2593      2578    2061.6      2684 0.011758   0.0301774
    6      16      3142      3126   2083.23      2192 0.00915926    0.027478
    7      16      3276      3260   1862.23       536 0.00824714   0.0267449
    8      16      3606      3590   1794.43      1320 0.0118938   0.0350541
    9      16      4293      4277   1900.32      2748 0.0301886   0.0330604
   10      14      5003      4989   1995.04      2848 0.0389717   0.0314977
Total time run:       10.0227
Total reads made:     5003
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   1996.67
Average IOPS:         499
Stddev IOPS:          202.3
Max IOPS:             712
Min IOPS:             134
Average Latency(s):   0.0314843
Max latency(s):       3.04463
Min latency(s):       0.00551523


# rados bench -p testbench 10 rand
hints = 1
  sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s) avg lat(s)
    0      15        15         0         0         0 -           0
    1      15       680       665   2657.61      2660 0.00919807   0.0224833
    2      15      1273      1258   2514.26      2372 0.00839656   0.0247125
    3      16      1863      1847    2461.4      2356 0.00994467   0.0236565
    4      16      2064      2048   2047.14       804 0.00809139   0.0223506
    5      16      2064      2048   1637.79         0 -   0.0223506
    6      16      2477      2461   1640.12       826 0.0286315   0.0383254
    7      16      3102      3086   1762.89      2500 0.0267464   0.0349189
    8      16      3513      3497      1748      1644 0.00890952    0.032269
    9      16      3617      3601      1600       416 0.00626917   0.0316019
   10      15      4014      3999   1599.18      1592 0.0461076   0.0393606
Total time run:       10.0481
Total reads made:     4014
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   1597.91
Average IOPS:         399
Stddev IOPS:          239.089
Max IOPS:             665
Min IOPS:             0
Average Latency(s):   0.0394035
Max latency(s):       3.00962
Min latency(s):       0.00449537