Hi Mark/Sage,

Please find the comparison data in the following document:

https://drive.google.com/file/d/0B7W-S0z_ymMJUXVmOUhINU01c3c/view?usp=sharing

Please download the doc and open it as xls, as Google is not able to show the graphs properly.

Setup:
------

Single OSD on one 700G NVMe drive, and single OSD on two 700G NVMe drives (combined with LVM)
48-core server, 40G link
Test is only for 4K RW from fio.

1. The first sheet shows IOPS and CPU utilization for Bluestore + rocks, Bluestore + ZS and filestore. This is with small shards and with the hack we are using for preconditioning. Bluestore + rocks is running with 16K min_alloc and ZS with 4K min_alloc. We can see that Bluestore with rocks and with ZS behave almost the same for a 600G image, and both are ~2X higher than filestore. ZS CPU utilization and WA (data not in the xls) are higher than rocks.

2. Next, I created 3 LVM volumes (data/db/wal) out of the 2 NVMe drives and created an image of 1TB. See in the next sheet how Bluestore + rocks performance came down. I didn't have time to collect the filestore data, but the expectation is that it will remain similar to the previous sheet. ZS is running with 16K min_alloc here, using the prototype shim implementation I was talking about in the standup. This implementation is not fully crash safe yet, but the expectation is that once we are done implementing the write-ahead log in ZS it should produce similar throughput. This gives ~90% benefit over rocks, and ZS with 4K min_alloc (as in the previous sheet) gives ~50% benefit (not plotted here). CPU utilization is similar to rocks.

3. This sheet demonstrates the benefit of single kv sync vs multi kv sync with rocks. With ZS we *need* multi kv, but with rocks, as you can see, we gain (~20%) only during peak performance. Later the db gets in the way.
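For reading the figures above ("~2X", "~90% benefit", "~20%"), here is a minimal sketch of how such a relative gain can be computed from two IOPS readings. The function names and the sample values are hypothetical placeholders for illustration only; they are not measurements from the spreadsheet.

```python
def relative_gain_percent(baseline_iops: float, candidate_iops: float) -> float:
    """Return how much higher candidate_iops is than baseline_iops, in percent."""
    if baseline_iops <= 0:
        raise ValueError("baseline_iops must be positive")
    return 100.0 * (candidate_iops - baseline_iops) / baseline_iops


def speedup_factor(baseline_iops: float, candidate_iops: float) -> float:
    """Return candidate_iops as a multiple of baseline_iops (2.0 means 2X)."""
    if baseline_iops <= 0:
        raise ValueError("baseline_iops must be positive")
    return candidate_iops / baseline_iops


if __name__ == "__main__":
    # Hypothetical placeholder readings, NOT data from the xls.
    baseline = 10_000.0
    candidate = 19_000.0
    print(f"gain: {relative_gain_percent(baseline, candidate):.0f}%")
    print(f"speedup: {speedup_factor(baseline, candidate):.2f}X")
```

Note that a "~2X" result corresponds to a 100% gain, while "~90% benefit" corresponds to a factor of about 1.9X.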
I think peak performance is limited today by the OSD layers upstream of Bluestore. If in the future we can optimize that and allow more traffic to reach Bluestore (the objectstore), the benefit of multiple sharded kv sync will be greater.

Here is the pull request for this:

https://github.com/ceph/ceph/pull/13037

Thanks & Regards
Somnath