Re: Returning to the performance in a small cluster topic


 



Your results are okay-ish. The general rule is that it's hard to achieve read latencies below 0.5 ms and write latencies below 1 ms with Ceph, **no matter what drives or network you use**. 10000 iops with one thread means 0.1 ms per operation, and that's just impossible with Ceph currently.

I've heard that some people manage to achieve 0.5 ms writes or even slightly less, but only with very fast CPUs. And 0.5 ms is still only 2000 iops.

And any DBMS needs sync writes because of its journal, so yes, with Ceph your peak write TPS is limited by that latency.
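
To spell out the arithmetic, here is a minimal illustrative sketch (plain Python, nothing Ceph-specific): at queue depth 1, IOPS is just the reciprocal of per-operation latency, and a database that fsyncs its journal on every commit is bounded the same way.

    # Illustrative arithmetic only: at queue depth 1 (one outstanding I/O),
    # throughput is simply the reciprocal of per-operation latency.

    def qd1_iops(latency_ms: float) -> float:
        """IOPS achievable by a single synchronous stream at the given latency."""
        return 1000.0 / latency_ms

    print(qd1_iops(0.1))   # 10000 iops would require 0.1 ms per write
    print(qd1_iops(0.5))   # a very good 0.5 ms Ceph write latency -> 2000 iops
    print(qd1_iops(1.0))   # a more typical 1 ms write latency -> 1000 iops

    # A database that fsyncs its WAL/journal once per commit is capped the
    # same way: peak single-session TPS <= 1000 / commit_latency_ms
    # (ignoring group commit, which can batch several transactions per fsync).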

I maintain an article on this topic in my wiki:

* https://yourcmc.ru/wiki/Ceph_performance (in English)
* https://yourcmc.ru/wiki/Производительность_Ceph (in Russian)

Dear colleagues,

  I would like to ask you for help with a performance problem on a
site backed by a Ceph storage backend. Cluster details are below.

  I've got a big problem with PostgreSQL performance. It runs inside a
VM on a virtio-scsi Ceph RBD image, and I see constant ~100% disk load
with latencies of up to hundreds of milliseconds (via atop) even when
pg_top shows only 10-20 tps. All other resources are almost untouched:
there is plenty of free memory and CPU, and the DB fits in memory, but
it still has performance issues.

  The cluster itself:
  Nautilus.
  6 nodes, 7 SSDs with 2 OSDs per SSD (14 OSDs in total).
  Each node: 2x Intel Xeon E5-2665 v1 (governor = performance,
powersaving disabled), 64GB RAM, Samsung SM863 1.92TB SSD, QDR
InfiniBand.

  I've run fio benchmarks against three targets:
  a VM with the virtio-scsi driver,
  a bare-metal host with a mounted RBD image,
  and the same bare-metal host with a mounted LVM partition on the
SM863 SSD drive.

  I've set bs=8k (as Postgres writes 8k blocks) and tried 1 and 8
jobs.
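
  For illustration, the runs were something along these lines (the exact
options here are an assumption; the real parameters and results are in the
pastebin below):

    # Sketch of the kind of fio run described above: synchronous 8k random
    # writes at queue depth 1, with 1 or 8 jobs. The options are assumptions,
    # not the exact job file that produced the pastebin results.
    import subprocess

    def run_fio(filename: str, numjobs: int) -> None:
        cmd = [
            "fio",
            "--name=pg-like-randwrite",
            f"--filename={filename}",
            "--rw=randwrite",        # random writes
            "--bs=8k",               # PostgreSQL page size
            "--iodepth=1",
            f"--numjobs={numjobs}",  # 1 and 8 jobs were compared
            "--direct=1",            # bypass the page cache
            "--fsync=1",             # fsync after every write, like a commit
            "--runtime=60",
            "--time_based",
            "--group_reporting",
        ]
        subprocess.run(cmd, check=True)

    # Example: run_fio("/dev/sdb", numjobs=1)  # hypothetical device path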

  Here are some results: https://pastebin.com/TFUg5fqA
  Drive load on the OSD hosts is very low, just a few percent.

  Here is my ceph config: https://pastebin.com/X5ZwaUrF

  The numbers don't look very good from my point of view, but they are
also not really bad (are they?). The problem is that I don't know which
direction to take next to solve the PostgreSQL problem.

  I've tried making a RAID0 with mdraid and 2 virtual drives but
haven't noticed any difference.

  Could you please tell me:
  Are these performance numbers good or bad for this hardware?
  Is it possible to tune anything further? Maybe you can point me to
docs or other papers?
  Is there any special VM tuning for running PostgreSQL on Ceph?
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



