Write performance issue under rocksdb kvstore

Hi Guys,

I am trying the latest ceph-9.1.0 with rocksdb 4.1, and ceph-9.0.3 with rocksdb 3.11, as the OSD backend. I use rbd to test performance; the following is my cluster info.

[ceph@xxx ~]$ ceph -s
    cluster b74f3944-d77f-4401-a531-fa5282995808
     health HEALTH_OK
     monmap e1: 1 mons at {xxx=xxx.xxx.xxx.xxx:6789/0}
            election epoch 1, quorum 0 xxx
     osdmap e338: 44 osds: 44 up, 44 in
            flags sortbitwise
      pgmap v1476: 2048 pgs, 1 pools, 158 MB data, 59 objects
            1940 MB used, 81930 GB / 81932 GB avail
                2048 active+clean

All the disks are spinning ones with the write cache turned on. Rocksdb's WAL and sst files are on the same disk as each OSD's data.
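For reference, this layout is just rocksdb's default behavior: a minimal sketch against plain librocksdb (outside Ceph; the DB path is made up) showing that when Options::wal_dir is left empty, the WAL files are created in the same directory, and therefore on the same disk, as the sst files.

#include <cassert>
#include <rocksdb/db.h>
#include <rocksdb/options.h>

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;
  // wal_dir left empty (the default): WAL files live in the DB directory,
  // i.e. on the same spinning disk that also holds the sst files.
  // options.wal_dir = "/some/other/disk/wal";  // would split them instead

  rocksdb::DB* db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(options, "/var/lib/osd-0/kvstore", &db);
  assert(s.ok());
  delete db;
  return 0;
}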

I use fio to generate the following write load:
fio -direct=1 -rw=randwrite -ioengine=sync -size=10M -bs=4K -group_reporting -directory /mnt/rbd_test/ -name xxx.1 -numjobs=1  

Test result:
WAL enabled + sync: false + disk write cache: on gives ~700 IOPS.
WAL enabled + sync: true (default) + disk write cache: on|off gives only ~25 IOPS.
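For context, the sync flag above corresponds to rocksdb's WriteOptions::sync. A minimal sketch against plain librocksdb (DB path, keys and values are made up) of the two modes being compared:

#include <cassert>
#include <rocksdb/db.h>
#include <rocksdb/options.h>

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;  // WAL is enabled by default

  rocksdb::DB* db = nullptr;
  assert(rocksdb::DB::Open(options, "/tmp/sync_test_db", &db).ok());

  // sync = false: the write goes into the WAL but returns without flushing
  // it to disk, so it can be lost on power failure; this is the ~700 IOPS case.
  rocksdb::WriteOptions fast;
  fast.sync = false;

  // sync = true: every write is followed by an fsync/fdatasync of the WAL
  // file, so each 4K write pays a full disk flush; this is the ~25 IOPS case.
  rocksdb::WriteOptions durable;
  durable.sync = true;

  db->Put(fast, "key1", "value1");
  db->Put(durable, "key2", "value2");
  delete db;
  return 0;
}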

I tuned some other rocksdb options, but with no luck. I tracked down the rocksdb code and found that each writer's Sync operation takes ~30 ms to finish; that alone caps throughput at roughly 1000/30 ≈ 33 IOPS, which matches the ~25 IOPS measured. As shown above, it is also strange that performance hardly differs no matter whether the disk write cache is on or off.
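To separate rocksdb from the disk itself, here is a rough sketch (plain POSIX, the file path is made up) of how one can measure the raw write+fsync latency on the OSD disk; if each flush already takes ~30 ms there regardless of the write-cache setting, the bottleneck is below rocksdb.

#include <chrono>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

int main() {
  // Hypothetical file on the same disk that holds the OSD's kvstore.
  int fd = open("/var/lib/osd-0/fsync_probe", O_CREAT | O_WRONLY, 0644);
  if (fd < 0) { perror("open"); return 1; }

  char buf[4096];
  memset(buf, 'x', sizeof(buf));

  const int iters = 100;
  double total_ms = 0;
  for (int i = 0; i < iters; ++i) {
    auto t0 = std::chrono::steady_clock::now();
    write(fd, buf, sizeof(buf));  // 4K append, similar to a WAL record
    fsync(fd);                    // roughly what a synced WAL write pays
    auto t1 = std::chrono::steady_clock::now();
    total_ms += std::chrono::duration<double, std::milli>(t1 - t0).count();
  }
  printf("avg write+fsync latency: %.2f ms\n", total_ms / iters);
  close(fd);
  return 0;
}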

Have you guys encountered a similar issue? Or am I missing something that causes rocksdb's poor write performance?

Thanks.
Zhi Zhang (David)


