Re: LevelDB Backend For Ceph OSD Preview

On Wed, 27 Nov 2013, Sebastien Han wrote:
> Hi Sage,
> If I recall correctly during the summit you mentioned that it was possible to disable the journal.
> Is it still part of the plan?

For the kv backend, yeah, since the key/value store will handle making 
things transactional.
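
For reference, the property being relied on is that LevelDB applies a whole
write batch atomically and, with sync enabled, durably before the call
returns, so object data and metadata stay consistent without a separate
journal. A minimal sketch against the stock LevelDB C++ API (the database
path and key names are only illustrative):

    #include <cassert>
    #include <leveldb/db.h>
    #include <leveldb/write_batch.h>

    int main() {
      leveldb::DB* db = nullptr;
      leveldb::Options options;
      options.create_if_missing = true;
      leveldb::Status s = leveldb::DB::Open(options, "/tmp/kv-backend-demo", &db);
      assert(s.ok());

      // Group the object write and its metadata update into one batch;
      // LevelDB commits the whole batch atomically (all or nothing).
      leveldb::WriteBatch batch;
      batch.Put("obj/foo/data", "hello world");
      batch.Put("obj/foo/attr/size", "11");

      leveldb::WriteOptions wo;
      wo.sync = true;  // flush the log before returning, like a journal commit
      s = db->Write(wo, &batch);
      assert(s.ok());

      delete db;
      return 0;
    }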

Haomai, I still haven't had a chance to look this over yet; it's on my 
list!  Did you look at the summit notes or watch the session?

sage


> 
> Sébastien Han 
> Cloud Engineer 
> 
> "Always give 100%. Unless you're giving blood." 
> 
> Phone: +33 (0)1 49 70 99 72 
> Mail: sebastien.han@xxxxxxxxxxxx 
> Address: 10, rue de la Victoire - 75009 Paris 
> Web: www.enovance.com - Twitter: @enovance 
> 
> On 25 Nov 2013, at 10:00, Sebastien Han <sebastien.han@xxxxxxxxxxxx> wrote:
> 
> > Nice job Haomai!
> > 
> > Sébastien Han 
> > Cloud Engineer 
> > 
> > "Always give 100%. Unless you're giving blood." 
> > 
> > Phone: +33 (0)1 49 70 99 72 
> > Mail: sebastien.han@xxxxxxxxxxxx 
> > Address: 10, rue de la Victoire - 75009 Paris 
> > Web: www.enovance.com - Twitter: @enovance 
> > 
> > On 25 Nov 2013, at 02:50, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:
> > 
> >> 
> >> 
> >> 
> >> On Mon, Nov 25, 2013 at 2:17 AM, Mark Nelson <mark.nelson@xxxxxxxxxxx> wrote:
> >> Great Work! This is very exciting!  Did you happen to try RADOS bench at different object sizes and concurrency levels?
> >> 
> >> 
> >> Maybe I can try it later. :-)
> >> 
> >> Mark
> >> 
> >> 
> >> On 11/24/2013 03:01 AM, Haomai Wang wrote:
> >> Hi all,
> >> 
> >> I'm sorry for the delay on the Emperor
> >> blueprint (http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Add_LevelDB_support_to_ceph_cluster_backend_store).
> >> I have now finished most of the work toward the blueprint's goal. Because of
> >> Sage's Firefly
> >> blueprint (http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend),
> >> I still need to adjust some code to match it. The branch is
> >> here (https://github.com/yuyuyu101/ceph/tree/wip/6173).
> >> 
> >> I have tested the LevelDB backend on three nodes (eight OSDs) and compared
> >> it to FileStore (ext4). I just used the built-in benchmark tool "rados bench"
> >> for the comparison. The default Ceph configuration is used, replication size
> >> is 2, and the filesystem is ext4; nothing else was changed. The results are
> >> below:
> >> 
> >> Rados Bench (bandwidth columns in MB/sec; "-" = not reported for the read runs)
> >> 
> >>            Store       Bandwidth  Avg Lat    Max Lat    Min Lat    Stddev Lat  Stddev BW  Max BW  Min BW
> >> Write 30   KVStore      24.590    4.87257    14.752     0.580851   2.97708      9.91938    44      0
> >>            FileStore    23.495    5.07716    13.0885    0.605118   3.30538     10.5986     76      0
> >> Write 20   KVStore      23.515    3.39745    11.6089    0.169507   2.58285      9.14467    44      0
> >>            FileStore    23.064    3.45711    11.5996    0.138595   2.75962      8.54156    40      0
> >> Write 10   KVStore      22.927    1.73815     5.53792   0.171028   1.05982      9.18403    44      0
> >>            FileStore    21.980    1.8198      6.46675   0.143392   1.20303      8.74401    40      0
> >> Write 5    KVStore      19.680    1.01492     3.10783   0.143758   0.561548     5.92575    36      0
> >>            FileStore    20.017    0.997019    3.05008   0.138161   0.571459     6.844      32      0
> >> Read 30    KVStore      65.852    1.80069     9.30039   0.115153   -            -          -       -
> >>            FileStore    60.688    1.96009    10.1146    0.061657   -            -          -       -
> >> Read 20    KVStore      59.372    1.30479     6.28435   0.016843   -            -          -       -
> >>            FileStore    60.738    1.28383     8.21304   0.012073   -            -          -       -
> >> Read 10    KVStore      65.502    0.608805    3.3917    0.016267   -            -          -       -
> >>            FileStore    55.814    0.7087      4.72626   0.011998   -            -          -       -
> >> Read 5     KVStore      64.176    0.307111    1.76391   0.017174   -            -          -       -
> >>            FileStore    54.928    0.364077    1.90182   0.011999   -            -          -       -
> >> 
> >> Charts can be viewed here (http://img42.com/ziwjP+) and here
> >> (http://img42.com/LKhoo+).
> >> 
> >> 
> >> From the above, I'm relieved that the LevelDB backend isn't useless. Most
> >> of the metrics are better, and with a larger LevelDB cache the results may
> >> be even more attractive.
> >> What's more, the LevelDB backend is used by "KeyValueStore", and a lot of
> >> optimization can still be done to improve performance, such as increasing
> >> the number of parallel threads or optimizing the I/O path.
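> >> 
> >> For what it's worth, "increasing cache size" here means LevelDB's block
> >> cache; a minimal sketch using the stock LevelDB C++ API (the sizes below
> >> are arbitrary examples, not tuned or tested values):
> >> 
> >>   #include <leveldb/cache.h>
> >>   #include <leveldb/db.h>
> >> 
> >>   int main() {
> >>     leveldb::Options options;
> >>     options.create_if_missing = true;
> >>     // Larger block cache: more hot data is served from memory instead of disk.
> >>     options.block_cache = leveldb::NewLRUCache(512 * 1024 * 1024);  // 512 MB (example)
> >>     // Larger write buffer: more updates are batched before a flush/compaction.
> >>     options.write_buffer_size = 64 * 1024 * 1024;                   // 64 MB (example)
> >> 
> >>     leveldb::DB* db = nullptr;
> >>     leveldb::Status s = leveldb::DB::Open(options, "/tmp/kv-backend-demo", &db);
> >>     if (!s.ok()) return 1;
> >> 
> >>     // ... run the workload against db ...
> >> 
> >>     delete db;
> >>     delete options.block_cache;  // the caller owns the cache object
> >>     return 0;
> >>   }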
> >> 
> >> Next, I used "rbd bench-write" to test. The result is disappointing:
> >> 
> >> RBD Bench-Write
> >> 
> >>                OPS/sec                  Bytes/sec
> >>                KVStore   FileStore      KVStore      FileStore
> >> Seq 4096 5      27.42     716.55        111861.51    2492149.21
> >> Rand 4096 5     28.27     504           112331.42    1683151.29
> >> 
> >> 
> >> This is because the kv backend doesn't support read/write operations with
> >> offset/length arguments, so each partial read/write has to issue an extra
> >> LevelDB read first. In the rbd case, most of the time is spent reading the
> >> entire large object back. There are ways to address this, such as splitting
> >> a large object into many small objects, or storing extra metadata so the
> >> expensive full-object read can be avoided.
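> >> 
> >> To make that concrete, here is a rough sketch (illustrative key layout and
> >> function names, not the actual KeyValueStore code) of what a small write at
> >> an offset costs when the whole object lives under a single key, versus what
> >> a striped layout would allow:
> >> 
> >>   #include <cstdint>
> >>   #include <string>
> >>   #include <leveldb/db.h>
> >> 
> >>   // Whole object under one key: a small overwrite has to read the full
> >>   // value, splice the new bytes in, and write everything back.
> >>   leveldb::Status write_extent_rmw(leveldb::DB* db, const std::string& oid,
> >>                                    uint64_t off, const std::string& data) {
> >>     std::string whole;
> >>     leveldb::Status s = db->Get(leveldb::ReadOptions(), "obj/" + oid, &whole);
> >>     if (!s.ok() && !s.IsNotFound()) return s;
> >>     if (whole.size() < off + data.size())
> >>       whole.resize(off + data.size(), '\0');
> >>     whole.replace(off, data.size(), data);
> >>     return db->Put(leveldb::WriteOptions(), "obj/" + oid, whole);
> >>   }
> >> 
> >>   // Striped layout: each fixed-size chunk gets its own key, so a chunk-sized,
> >>   // chunk-aligned write touches only that key. Partial chunks would still
> >>   // need a read-modify-write, but only of one small chunk.
> >>   leveldb::Status write_chunk_striped(leveldb::DB* db, const std::string& oid,
> >>                                       uint64_t off, const std::string& data,
> >>                                       uint64_t chunk_size = 64 * 1024) {
> >>     uint64_t idx = off / chunk_size;  // assumes off is chunk-aligned and
> >>                                       // data.size() == chunk_size
> >>     std::string key = "obj/" + oid + "/" + std::to_string(idx);
> >>     return db->Put(leveldb::WriteOptions(), key, data);
> >>   }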
> >> 
> >> As Sage mentioned in <osd: new key/value backend>
> >> (http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend),
> >> more kv backends can be added now, and I look forward to more people getting
> >> interested in it. I think the radosgw use case can fit a kv store well in a
> >> short time.
> >> 
> >> --
> >> 
> >> Best Regards,
> >> 
> >> Wheat
> >> 
> >> 
> >> 
> >> 
> >> -- 
> >> Best Regards,
> >> 
> >> Wheat
> >> 
> 
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



