Hi folks,
we are running a 3-node Proxmox cluster with - of course - Ceph :)
ceph version 12.2.12 (39cfebf25a7011204a9876d2950e4b28aba66d11) luminous (stable)
10G network. iperf reports almost 10G between all nodes.
We are using mixed standard SSDs (Crucial / Samsung). We are aware that these disks cannot deliver high IOPS or great throughput, but we have several of these clusters and this one is showing very poor performance.
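One thing I still want to rule out is a single bad disk. My plan is to measure the raw sync-write latency of every SSD individually with fio, roughly like this (/dev/sdX is just a placeholder, and of course this would only be run against an unused disk or partition, since it writes to the device):

# fio --name=sync-write --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based

Consumer SSDs without power-loss protection often perform very poorly on this kind of O_DSYNC workload, which directly limits Ceph write latency.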
NOW the strange fact:
While one specific node is rebooting (i.e. only two of the three nodes are up), the throughput is acceptable.
But as soon as that node is back online, throughput drops to roughly a third (from about 161 MB/s to 58 MB/s in the rados bench runs below).
2 NODES UP (third node rebooting):
# rados bench -p scbench 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_pve3_1767693
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       0         0         0         0         0            -           0
    1      16        55        39   155.992       156    0.0445665    0.257988
    2      16       110        94    187.98       220     0.087097    0.291173
    3      16       156       140   186.645       184     0.462171    0.286895
    4      16       184       168    167.98       112    0.0235336    0.358085
    5      16       210       194   155.181       104     0.112401    0.347883
    6      16       252       236   157.314       168     0.134099    0.382159
    7      16       287       271   154.838       140    0.0264864     0.40092
    8      16       329       313   156.481       168    0.0609964    0.394753
    9      16       364       348   154.649       140     0.244309    0.392331
   10      16       416       400   159.981       208     0.277489    0.387424
Total time run: 10.335496
Total writes made: 417
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 161.386
Stddev Bandwidth: 37.8065
Max bandwidth (MB/sec): 220
Min bandwidth (MB/sec): 104
Average IOPS: 40
Stddev IOPS: 9
Max IOPS: 55
Min IOPS: 26
Average Latency(s): 0.396434
Stddev Latency(s): 0.428527
Max latency(s): 1.86968
Min latency(s): 0.020558
THIRD NODE ONLINE:
root@pve3:/# rados bench -p scbench 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_pve3_1771977
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       0         0         0         0         0            -           0
    1      16        39        23   91.9943        92      0.21353    0.267249
    2      16        46        30   59.9924        28      0.29527    0.268672
    3      16        53        37   49.3271        28     0.122732    0.259731
    4      16        53        37   36.9954         0            -    0.259731
    5      16        53        37   29.5963         0            -    0.259731
    6      16        87        71   47.3271   45.3333     0.241921     1.19831
    7      16       106        90   51.4214        76     0.124821     1.07941
    8      16       129       113    56.492        92    0.0314146    0.941378
    9      16       142       126   55.9919        52     0.285536    0.871445
   10      16       147       131   52.3925        20     0.354803    0.852074
Total time run: 10.138312
Total writes made: 148
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 58.3924
Stddev Bandwidth: 34.405
Max bandwidth (MB/sec): 92
Min bandwidth (MB/sec): 0
Average IOPS: 14
Stddev IOPS: 8
Max IOPS: 23
Min IOPS: 0
Average Latency(s): 1.08818
Stddev Latency(s): 1.55967
Max latency(s): 5.02514
Min latency(s): 0.0255947
Does this point to a single faulty node?
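To narrow that down, my next step would be to compare per-OSD latency and per-OSD write speed while all three nodes are up, something along these lines (osd.0 is just an example ID, I would repeat the bench for every OSD):

# ceph osd perf
# ceph tell osd.0 bench

If the OSDs on that one node consistently show much higher commit/apply latency or much lower bench throughput, that would point at its disks or controller.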
root@pve3:/# ceph status
  cluster:
    id:     138c857a-c4e6-4600-9320-9567011470d6
    health: HEALTH_WARN
            application not enabled on 1 pool(s) (that's just for benchmarking)

  services:
    mon: 3 daemons, quorum pve1,pve2,pve3
    mgr: pve1(active), standbys: pve3, pve2
    osd: 12 osds: 12 up, 12 in

  data:
    pools:   2 pools, 612 pgs
    objects: 758.52k objects, 2.89TiB
    usage:   8.62TiB used, 7.75TiB / 16.4TiB avail
    pgs:     611 active+clean
             1   active+clean+scrubbing+deep

  io:
    client:  4.99MiB/s rd, 1.36MiB/s wr, 678op/s rd, 105op/s wr
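(I am aware the HEALTH_WARN is only the missing application tag on the benchmark pool; it could be cleared with something like

# ceph osd pool application enable scbench rbd

so I don't think it is related to the performance problem.)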
Thank you.
Stefan