Re: Questions about r/w low performance on ceph pacific vs ceph luminous

I looked into nocache vs direct.  It looks like nocache just requests that the caches be dumped before doing its operations, while direct uses direct I/O.  Writes getting cached would make the nocache run appear much faster.  Those tests are not apples-to-apples.
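If you want an apples-to-apples comparison on both clusters, something like the fio run below keeps direct I/O in play on both sides.  This is just a sketch: it assumes fio is installed and that testfile sits on the RBD-backed filesystem you're testing, so adjust paths to your setup.

# --direct=1 bypasses the page cache so cached writes can't inflate the result;
# --end_fsync=1 makes sure everything is flushed before the run counts as done.
fio --name=seq-write --filename=testfile --rw=write --bs=1M --size=1G \
    --direct=1 --ioengine=libaio --iodepth=1 --end_fsync=1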

I'm also trying to decode how you did your two tests.  Am I reading it right that your Luminous test is a VM in OpenStack writing through a translation layer to an RBD image, and the Pacific test is a native host with an RBD mounted on it?  If so, it feels like whatever sits in between the VM and the RBD is doing some caching.  A good way to see what's going on is to watch whatever network carries your replication traffic while you run the test.  If you start the test and the network traffic lags in starting and then continues on past the end of the test, you have at least one cache in the way.  If the traffic starts and stops with the test, you are most likely not seeing caching artifacts.
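As an example of watching that traffic, something like the sar run below on an OSD host shows per-second throughput while the dd test runs.  This is only a sketch: eth1 is a placeholder for whichever interface actually carries your cluster/replication traffic, and it assumes the sysstat package is installed.

# print rx/tx rates once per second for the replication interface
sar -n DEV 1 | grep eth1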

That's where I would start looking in your environment.   

-paul

________________________________________
From: Shai Levi (Nokia) <shai.levi@xxxxxxxxx>
Sent: Wednesday, December 7, 2022 9:35 AM
To: ceph-users@xxxxxxx
Subject:  Questions about r/w low performance on ceph pacific vs ceph luminous

Hey,

We have seen that RBD disk image write IOPS on Ceph Pacific are much lower than on Ceph Luminous.

We performed a simple dd write test with the following results:
**the hardware and the OSD layout are the same in both environments

Writing to an rbd image (via a VM on OpenStack) on Luminous:
[root@noam-test-storagebm-0 shai]# dd if=/dev/zero of=testfile bs=1M count=1024 oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.19544 s, 489 MB/s

Writing to an rbd image (PV mounted to a k8s pod) on Pacific:
[root@noam-test-masterbm-0 noam (Active)]# dd if=/dev/zero of=testfile bs=1M count=1024 oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 5.51966 s, 195 MB/s

As you can see, the write is almost 2.5 times as fast on Luminous as on Pacific.

We've also noticed that changing the oflag value from "direct" to "nocache" makes the write about 4 times faster (writing to the rbd image on Pacific):
[root@noam-test-masterbm-0 noam (Active)]# dd if=/dev/zero of=testfile bs=1M count=1024 oflag=nocache
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.32078 s, 813 MB/s

We were wondering:

  1.  Why does oflag=direct vs oflag=nocache make such a significant difference in speed when writing to an RBD image?
  2.  Why is there such a big difference in I/O performance between Luminous and Pacific when using oflag=direct? Is it related to the move from ceph-disk to ceph-volume?
  3.  Is it related to the BIOS/RAID controller cache configuration?
  4.  Is there a documentation page on how to optimize read/write performance for both RBD and CephFS?

Best regards,
Shai

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx