Nice job Haomai!

––––
Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72
Mail: sebastien.han@xxxxxxxxxxxx
Address: 10, rue de la Victoire - 75009 Paris
Web: www.enovance.com - Twitter: @enovance

On 25 Nov 2013, at 02:50, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:

> On Mon, Nov 25, 2013 at 2:17 AM, Mark Nelson <mark.nelson@xxxxxxxxxxx> wrote:
> > Great Work! This is very exciting! Did you happen to try RADOS bench at
> > different object sizes and concurrency levels?
>
> Maybe I can try that later. :-)
>
> > Mark
> >
> > On 11/24/2013 03:01 AM, Haomai Wang wrote:
> > > Hi all,
> > >
> > > For the Emperor blueprint
> > > (http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Add_LevelDB_support_to_ceph_cluster_backend_store),
> > > I'm sorry for the delay in progress. I have now finished most of the work
> > > toward the blueprint's goal. Because of Sage's Firefly blueprint
> > > (http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend),
> > > I need to adjust some code to match it. The branch is here:
> > > https://github.com/yuyuyu101/ceph/tree/wip/6173
> > >
> > > I have tested the LevelDB backend on three nodes (eight OSDs) and compared
> > > it to FileStore (ext4). I just used the built-in benchmark tool
> > > "rados bench" for the comparison. The default Ceph configuration is used,
> > > the replication size is 2, the filesystem is ext4, and nothing else was
> > > changed. The results are below (bandwidth columns in MB/sec, latencies in
> > > seconds):
> > >
> > > Rados Bench
> > >
> > > Test       Backend    Bandwidth  AvgLat    MaxLat    MinLat     StddevLat  StddevBW  MaxBW  MinBW
> > > Write 30   KVStore    24.590     4.87257   14.752    0.580851   2.97708    9.91938   44     0
> > > Write 30   FileStore  23.495     5.07716   13.0885   0.605118   3.30538    10.5986   76     0
> > > Write 20   KVStore    23.515     3.39745   11.6089   0.169507   2.58285    9.14467   44     0
> > > Write 20   FileStore  23.064     3.45711   11.5996   0.138595   2.75962    8.54156   40     0
> > > Write 10   KVStore    22.927     1.73815   5.53792   0.171028   1.05982    9.18403   44     0
> > > Write 10   FileStore  21.980     1.8198    6.46675   0.143392   1.20303    8.74401   40     0
> > > Write 5    KVStore    19.680     1.01492   3.10783   0.143758   0.561548   5.92575   36     0
> > > Write 5    FileStore  20.017     0.997019  3.05008   0.138161   0.571459   6.844     32     0
> > > Read 30    KVStore    65.852     1.80069   9.30039   0.115153   -          -         -      -
> > > Read 30    FileStore  60.688     1.96009   10.1146   0.061657   -          -         -      -
> > > Read 20    KVStore    59.372     1.30479   6.28435   0.016843   -          -         -      -
> > > Read 20    FileStore  60.738     1.28383   8.21304   0.012073   -          -         -      -
> > > Read 10    KVStore    65.502     0.608805  3.3917    0.016267   -          -         -      -
> > > Read 10    FileStore  55.814     0.7087    4.72626   0.011998   -          -         -      -
> > > Read 5     KVStore    64.176     0.307111  1.76391   0.017174   -          -         -      -
> > > Read 5     FileStore  54.928     0.364077  1.90182   0.011999   -          -         -      -
> > >
> > > Charts can be viewed here (http://img42.com/ziwjP+) and here
> > > (http://img42.com/LKhoo+).
> > >
> > > From the above, I'm relieved that the LevelDB backend isn't useless. Most
> > > of the metrics are better, and if the LevelDB cache size is increased the
> > > results may be even more attractive. What's more, the LevelDB backend is
> > > driven through "KeyValueStore", and many optimizations can still be done
> > > to improve performance, such as increasing the number of parallel threads
> > > or optimizing the I/O path.
> > >
> > > Next, I used "rbd bench-write" to test. The result is a pity:
> > >
> > > RBD Bench-Write
> > >
> > > Test         Backend    OPS/sec  Bytes/sec
> > > Seq 4096 5   KVStore    27.42    111861.51
> > > Seq 4096 5   FileStore  716.55   2492149.21
> > > Rand 4096 5  KVStore    28.27    112331.42
> > > Rand 4096 5  FileStore  504      1683151.29
> > >
> > > This is because the kv backend doesn't support read/write operations with
> > > offset/length arguments, so each such read/write has to issue an
> > > additional LevelDB read call. In the rbd case, much of the time is
> > > consumed by reading back the entire large object. There are some ways to
> > > change this, such as splitting a large object into multiple small objects,
> > > or saving metadata so that the expensive read can be avoided.
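> > > To make the "split a large object into multiple small objects" idea a bit
> > > more concrete, here is a rough sketch of one possible striped layout. It
> > > is only an illustration, not what the branch currently does: a std::map
> > > stands in for LevelDB, and the 4 KB stripe size, the key format, the
> > > kv_write/kv_read helper names and the object name below are all made up.
> > > The point is that a small write at an offset only touches the stripes
> > > covering that range, instead of reading the whole object value back first:
> > >
> > > // Sketch only: stripe an object's data across small key/value entries so
> > > // a 4 KB write touches one small value instead of the whole object.
> > > #include <algorithm>
> > > #include <cstdint>
> > > #include <cstdio>
> > > #include <map>
> > > #include <string>
> > >
> > > static const uint64_t STRIPE = 4096;           // assumed stripe size
> > > static std::map<std::string, std::string> kv;  // stand-in for LevelDB
> > >
> > > // Key of the stripe holding byte range [idx*STRIPE, (idx+1)*STRIPE).
> > > static std::string stripe_key(const std::string &oid, uint64_t idx) {
> > >   char suffix[20];
> > >   snprintf(suffix, sizeof(suffix), "%016llx", (unsigned long long)idx);
> > >   return oid + "." + suffix;
> > > }
> > >
> > > // Write [off, off+data.size()), touching only the stripes it covers.
> > > void kv_write(const std::string &oid, uint64_t off, const std::string &data) {
> > >   for (uint64_t pos = 0; pos < data.size(); ) {
> > >     uint64_t in_off = (off + pos) % STRIPE;
> > >     uint64_t len = std::min<uint64_t>(STRIPE - in_off, data.size() - pos);
> > >     std::string &val = kv[stripe_key(oid, (off + pos) / STRIPE)];
> > >     if (val.size() < in_off + len)
> > >       val.resize(in_off + len, '\0');          // extend this stripe only
> > >     val.replace(in_off, len, data, pos, len);
> > >     pos += len;
> > >   }
> > > }
> > >
> > > // Read [off, off+len) by walking the covering stripes; holes read as zeros.
> > > std::string kv_read(const std::string &oid, uint64_t off, uint64_t len) {
> > >   std::string out;
> > >   while (len > 0) {
> > >     uint64_t in_off = off % STRIPE;
> > >     uint64_t want = std::min(STRIPE - in_off, len);
> > >     std::map<std::string, std::string>::const_iterator it =
> > >         kv.find(stripe_key(oid, off / STRIPE));
> > >     std::string piece = (it != kv.end() && in_off < it->second.size())
> > >                             ? it->second.substr(in_off, want)
> > >                             : std::string();
> > >     piece.resize(want, '\0');
> > >     out += piece;
> > >     off += want;
> > >     len -= want;
> > >   }
> > >   return out;
> > > }
> > >
> > > int main() {
> > >   // Hypothetical rbd data object: write 4 KB at offset 8192, read it back.
> > >   kv_write("rb.0.1234.000000000001", 8192, std::string(4096, 'x'));
> > >   printf("read back %zu bytes\n",
> > >          kv_read("rb.0.1234.000000000001", 8192, 4096).size());
> > >   return 0;
> > > }
> > >
> > > A real backend would also need to track object size and handle truncate
> > > and clone, but the core win is that a 4 KB rbd write no longer has to read
> > > the entire object back from LevelDB.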
> > >
> > > As Sage mentioned in <osd: new key/value backend>
> > > (http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend),
> > > more kv backends can be added now, and I look forward to more people
> > > getting interested in it. I think the radosgw case can fit a kv store in a
> > > short time.
> > >
> > > --
> > > Best Regards,
> > >
> > > Wheat
>
> --
> Best Regards,
>
> Wheat

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com