Re: LevelDB Backend For Ceph OSD Preview

I'm curious about this too. With a leveldb backend, how would we invoke leveldb to get around our current journaling requirements?

On 11/26/2013 05:06 PM, Sebastien Han wrote:
Hi Sage,
If I recall correctly, during the summit you mentioned that it was possible to disable the journal.
Is it still part of the plan?

––––
Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72
Mail: sebastien.han@xxxxxxxxxxxx
Address : 10, rue de la Victoire - 75009 Paris
Web : www.enovance.com - Twitter : @enovance

On 25 Nov 2013, at 10:00, Sebastien Han <sebastien.han@xxxxxxxxxxxx> wrote:

Nice job Haomai!

––––
Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72
Mail: sebastien.han@xxxxxxxxxxxx
Address : 10, rue de la Victoire - 75009 Paris
Web : www.enovance.com - Twitter : @enovance

On 25 Nov 2013, at 02:50, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:




On Mon, Nov 25, 2013 at 2:17 AM, Mark Nelson <mark.nelson@xxxxxxxxxxx> wrote:
Great Work! This is very exciting!  Did you happen to try RADOS bench at different object sizes and concurrency levels?


Maybe I can try it later. :-)

Mark


On 11/24/2013 03:01 AM, Haomai Wang wrote:
Hi all,

For the Emperor
blueprint (http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Add_LevelDB_support_to_ceph_cluster_backend_store),
I'm sorry for delaying the progress. I have now finished most of the work
toward the blueprint's goal. Because of Sage's Firefly
blueprint (http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend),
I need to adjust some code to match it. The branch is
here (https://github.com/yuyuyu101/ceph/tree/wip/6173).

I have tested the LevelDB backend on three nodes (eight OSDs) and compared
it to FileStore (ext4). I used the internal benchmark tool "rados bench" to
get the comparison. The default Ceph configuration was used with a
replication size of 2; the filesystem is ext4 and nothing else was changed.
The results are below:
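(For reference, the 30/20/10/5 suffixes in the table appear to be the -t
concurrency values passed to "rados bench"; the runs would look roughly
like the following, with the pool name and run length here only as
placeholders:

    rados bench -p testpool 60 write -t 30 --no-cleanup
    rados bench -p testpool 60 seq -t 30

and likewise for -t 20, 10 and 5.)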

Rados Bench

(BW = Bandwidth in MB/sec, Lat = Latency; the Stddev/Max/Min bandwidth
columns were not reported for the read runs.)

Test      Store      BW       Avg Lat   Max Lat   Min Lat    Stddev Lat  Stddev BW  Max BW  Min BW
Write 30  KVStore    24.590   4.87257   14.752    0.580851   2.97708     9.91938    44      0
          FileStore  23.495   5.07716   13.0885   0.605118   3.30538     10.5986    76      0
Write 20  KVStore    23.515   3.39745   11.6089   0.169507   2.58285     9.14467    44      0
          FileStore  23.064   3.45711   11.5996   0.138595   2.75962     8.54156    40      0
Write 10  KVStore    22.927   1.73815   5.53792   0.171028   1.05982     9.18403    44      0
          FileStore  21.980   1.8198    6.46675   0.143392   1.20303     8.74401    40      0
Write 5   KVStore    19.680   1.01492   3.10783   0.143758   0.561548    5.92575    36      0
          FileStore  20.017   0.997019  3.05008   0.138161   0.571459    6.844      32      0
Read 30   KVStore    65.852   1.80069   9.30039   0.115153   -           -          -       -
          FileStore  60.688   1.96009   10.1146   0.061657   -           -          -       -
Read 20   KVStore    59.372   1.30479   6.28435   0.016843   -           -          -       -
          FileStore  60.738   1.28383   8.21304   0.012073   -           -          -       -
Read 10   KVStore    65.502   0.608805  3.3917    0.016267   -           -          -       -
          FileStore  55.814   0.7087    4.72626   0.011998   -           -          -       -
Read 5    KVStore    64.176   0.307111  1.76391   0.017174   -           -          -       -
          FileStore  54.928   0.364077  1.90182   0.011999   -           -          -       -


Charts can be viewed here (http://img42.com/ziwjP+) and
here (http://img42.com/LKhoo+).

From the above, I'm relieved that the LevelDB backend isn't useless. Most
of the metrics are better, and with a larger cache size for LevelDB the
results could be more attractive still. Moreover, the LevelDB backend is
driven through "KeyValueStore", so many optimizations are still possible
to improve performance, such as increasing the number of parallel threads
or optimizing the I/O path.
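To make the cache point concrete: LevelDB's block cache and write buffer
are set when the database is opened, so the KeyValueStore backend could
simply pass larger values there. A minimal sketch using the stock LevelDB
C++ API follows; the 512 MB / 64 MB sizes are only illustrative, and the
wiring to a ceph.conf option is left out:

    #include <cassert>
    #include <string>
    #include <leveldb/cache.h>
    #include <leveldb/db.h>

    // Open a LevelDB instance with a larger block cache and write buffer
    // than the library defaults.  Sizes here are illustrative only.
    leveldb::DB* open_store(const std::string& path) {
      leveldb::Options options;
      options.create_if_missing = true;
      options.block_cache = leveldb::NewLRUCache(512 * 1024 * 1024);  // 512 MB read cache
      options.write_buffer_size = 64 * 1024 * 1024;                   // 64 MB memtable
      leveldb::DB* db = NULL;
      leveldb::Status s = leveldb::DB::Open(options, path, &db);
      assert(s.ok());
      return db;
    }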

Next, I used "rbd bench-write" to test. The result is a pity:

RBD Bench-Write

Test         Store      OPS/sec   Bytes/sec
Seq 4096 5   KVStore    27.42     111861.51
             FileStore  716.55    2492149.21
Rand 4096 5  KVStore    28.27     112331.42
             FileStore  504       1683151.29


This is just because the KV backend doesn't support read/write operations
with offset/length arguments, so each read/write has to issue an additional
LevelDB read call. Most of the time is spent reading the entire large
object in the rbd case. There are some ways to address this, such as
splitting a large object into multiple small objects or storing extra
metadata to avoid the expensive read.
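To illustrate what the missing offset/length support costs, here is a rough
sketch of what a partial object write turns into on a plain key/value
store. This is not the actual KeyValueStore code, just the read-modify-write
pattern described above:

    #include <cstdint>
    #include <string>
    #include <leveldb/db.h>

    // Writing `data` at byte offset `off` of an object stored as a single
    // LevelDB value: the whole value must be read back and rewritten,
    // even for a small 4 KB update into a large rbd object.
    void write_extent(leveldb::DB* db, const std::string& oid,
                      uint64_t off, const std::string& data) {
      std::string value;
      db->Get(leveldb::ReadOptions(), oid, &value);   // read the entire object
                                                      // (NotFound leaves it empty)
      if (value.size() < off + data.size())
        value.resize(off + data.size(), '\0');        // extend the object if needed
      value.replace(off, data.size(), data);          // splice in the new extent
      db->Put(leveldb::WriteOptions(), oid, value);   // rewrite the entire object
    }

Splitting an object across many small keys (for example, oid plus a stripe
index) would let a 4 KB write touch only one stripe instead of the whole
object.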

As Sage mentioned in <osd: new key/value
backend> (http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend),
more KV backends can be added now, and I look forward to more people
getting interested in it. I think the radosgw use case can fit the KV store
well in the short term.

--

Best Regards,

Wheat




--
Best Regards,

Wheat





