Re: LevelDB Backend For Ceph OSD Preview

Haomai Wang <haomaiwang@xxxxxxxxx> · Mon, 25 Nov 2013 09:50:51 +0800

On Mon, Nov 25, 2013 at 2:17 AM, Mark Nelson <mark.nelson@xxxxxxxxxxx> wrote:

Great work! This is very exciting! Did you happen to try RADOS bench at different object sizes and concurrency levels?

Maybe I can try it later. :-)

Mark

On 11/24/2013 03:01 AM, Haomai Wang wrote:

Hi all,

I'm sorry for the delay on the Emperor blueprint
(http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Add_LevelDB_support_to_ceph_cluster_backend_store).
I have now done most of the work toward the blueprint's goal. Because of
sage's F blueprint
(http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend),
I need to adjust some code to match it. The branch is here
(https://github.com/yuyuyu101/ceph/tree/wip/6173).

I have tested the LevelDB backend on three nodes (eight OSDs) and
compared it to FileStore (ext4). I only used the built-in benchmark tool
"rados bench" for the comparison. The default Ceph configuration is used
and the replication size is 2. The filesystem is ext4 and nothing else
was changed. The results are below:

Rados Bench  Store      Bandwidth  Average   Max      Min       Stddev    Stddev     Max        Min
                        (MB/sec)   Latency   Latency  Latency   Latency   Bandwidth  Bandwidth  Bandwidth
                                                                          (MB/sec)   (MB/sec)   (MB/sec)
Write 30     KVStore    24.590     4.87257   14.752   0.580851  2.97708   9.91938    44         0
Write 30     FileStore  23.495     5.07716   13.0885  0.605118  3.30538   10.5986    76         0
Write 20     KVStore    23.515     3.39745   11.6089  0.169507  2.58285   9.14467    44         0
Write 20     FileStore  23.064     3.45711   11.5996  0.138595  2.75962   8.54156    40         0
Write 10     KVStore    22.927     1.73815   5.53792  0.171028  1.05982   9.18403    44         0
Write 10     FileStore  21.980     1.8198    6.46675  0.143392  1.20303   8.74401    40         0
Write 5      KVStore    19.680     1.01492   3.10783  0.143758  0.561548  5.92575    36         0
Write 5      FileStore  20.017     0.997019  3.05008  0.138161  0.571459  6.844      32         0
Read 30      KVStore    65.852     1.80069   9.30039  0.115153
Read 30      FileStore  60.688     1.96009   10.1146  0.061657
Read 20      KVStore    59.372     1.30479   6.28435  0.016843
Read 20      FileStore  60.738     1.28383   8.21304  0.012073
Read 10      KVStore    65.502     0.608805  3.3917   0.016267
Read 10      FileStore  55.814     0.7087    4.72626  0.011998
Read 5       KVStore    64.176     0.307111  1.76391  0.017174
Read 5       FileStore  54.928     0.364077  1.90182  0.011999

Charts can be viewed here (http://img42.com/ziwjP+) and here
(http://img42.com/LKhoo+).

From the above, I'm relieved that the LevelDB backend isn't useless.
Most of the metrics are better, and with a larger LevelDB cache the
results may be even more attractive.

Moreover, the LevelDB backend is used through "KeyValueStore", and many
optimizations can still be done to improve performance, such as
increasing the number of parallel threads or optimizing the I/O path.
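To make "most of the metrics are better" concrete for bandwidth, here is a small sketch that computes the relative difference between the two backends. The numbers are copied from the rados bench results above; the function and variable names are only illustrative.

```python
# Bandwidth (MB/sec) from the rados bench results: (KVStore, FileStore).
bandwidth = {
    "Write 30": (24.590, 23.495),
    "Write 20": (23.515, 23.064),
    "Write 10": (22.927, 21.980),
    "Write 5": (19.680, 20.017),
    "Read 30": (65.852, 60.688),
    "Read 20": (59.372, 60.738),
    "Read 10": (65.502, 55.814),
    "Read 5": (64.176, 54.928),
}


def relative_gain(kv, fs):
    """Percentage by which KVStore exceeds FileStore (negative if slower)."""
    return (kv - fs) / fs * 100.0


gains = {test: relative_gain(kv, fs) for test, (kv, fs) in bandwidth.items()}

for test, gain in gains.items():
    print(f"{test:<9} {gain:+6.1f}%")
```

On these figures KVStore is ahead in six of the eight bandwidth rows and trails only in Write 5 and Read 20.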

Next, I used "rbd bench-write" to test. The result is disappointing:

RBD Bench-Write  Store      OPS/sec  Bytes/sec
Seq 4096 5       KVStore    27.42    111861.51
Seq 4096 5       FileStore  716.55   2492149.21
Rand 4096 5      KVStore    28.27    112331.42
Rand 4096 5      FileStore  504      1683151.29
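The size of the gap is easier to see as a ratio. A short sketch using the OPS/sec figures from the rbd bench-write results above (variable names are illustrative):

```python
# OPS/sec from the rbd bench-write results: (KVStore, FileStore).
ops = {
    "Seq 4096 5": (27.42, 716.55),
    "Rand 4096 5": (28.27, 504.0),
}

# How many times more operations per second FileStore completes.
slowdown = {test: fs / kv for test, (kv, fs) in ops.items()}

for test, factor in slowdown.items():
    print(f"{test:<12} FileStore is {factor:.1f}x faster")
```

FileStore comes out roughly 26x faster on the sequential run and roughly 18x faster on the random run.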

This is because the kv backend doesn't support read/write operations
with offset/length arguments, so each read/write operation needs an
additional LevelDB read call. In the rbd case, most of the time is spent
reading the entire large object. There are some ways to improve this,
such as splitting a large object into multiple small objects or saving
metadata to avoid the expensive read.
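The effect described above can be shown with a toy model. This is not the KeyValueStore code from the branch, just a sketch under assumptions: a dict stands in for LevelDB, the class and function names are hypothetical, the 4096-byte strip size is arbitrary, and the 4 MiB object size is assumed for the demo. It compares a 4096-byte partial write against one value per object with the same write against an object split into small strips.

```python
class WholeValueStore:
    """Toy key/value backend: values can only be read or written whole."""

    def __init__(self):
        self.data = {}
        self.bytes_read = 0  # total bytes fetched via get()

    def get(self, key):
        value = self.data.get(key, b"")
        self.bytes_read += len(value)
        return value

    def put(self, key, value):
        self.data[key] = bytes(value)


def patch(old, start, buf):
    """Return old with buf written at start, zero-filling any gap."""
    out = bytearray(old)
    if len(out) < start + len(buf):
        out.extend(bytes(start + len(buf) - len(out)))
    out[start:start + len(buf)] = buf
    return out


def write_whole(store, name, offset, buf):
    """One value per object: every partial write reads the full object."""
    store.put(name, patch(store.get(name), offset, buf))


def write_striped(store, name, offset, buf, strip_size=4096):
    """Object split into fixed-size strips, one key per strip."""
    pos = 0
    while pos < len(buf):
        index, start = divmod(offset + pos, strip_size)
        n = min(strip_size - start, len(buf) - pos)
        key = f"{name}.{index}"
        chunk = buf[pos:pos + n]
        if n == strip_size:
            store.put(key, chunk)  # strip fully overwritten: no read needed
        else:
            store.put(key, patch(store.get(key), start, chunk))
        pos += n


OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size for the demo

whole = WholeValueStore()
whole.put("obj", bytes(OBJECT_SIZE))
write_whole(whole, "obj", 8192, b"x" * 4096)

striped = WholeValueStore()
write_striped(striped, "obj", 8192, b"x" * 4096)

print("whole-value bytes read:", whole.bytes_read)
print("striped bytes read:   ", striped.bytes_read)
```

In this model the whole-value layout reads the full object for every small write, while the striped layout reads at most the strips the write partially covers. The trade-off is more keys per object and some bookkeeping to map offsets to strips.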

As sage mentioned in <osd: new key/value backend>
(http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend),
more kv backends can be added now, and I look forward to more people
being interested in it. I think the radosgw use case can fit the kv
store in short ti

--

Best Regards,

Wheat

_______________________________________________

ceph-users mailing list

ceph-users@xxxxxxxxxxxxxx

http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
