LevelDB Backend For Ceph OSD Preview

Hi all,

Regarding the Emperor blueprint (http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Add_LevelDB_support_to_ceph_cluster_backend_store), I'm sorry for the delayed progress. I have now finished most of the work toward the blueprint's goal. Because of Sage's Firefly blueprint (http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend), I need to adjust some of the code to match it. The branch is here: https://github.com/yuyuyu101/ceph/tree/wip/6173

I have tested the LevelDB backend on three nodes (eight OSDs) and compared it to FileStore (ext4). I used the built-in benchmark tool "rados bench" for the comparison. The default Ceph configuration was used with a replication size of 2; the filesystem is ext4 and nothing else was changed. The results are below:

Rados Bench (KVStore = LevelDB backend, FileStore = ext4; bandwidth in MB/sec, latencies in seconds)

Workload  Backend     Bandwidth  Avg Lat   Max Lat  Min Lat   Stddev Lat  Stddev BW  Max BW  Min BW
Write 30  KVStore     24.590     4.87257   14.752   0.580851  2.97708     9.91938    44      0
          FileStore   23.495     5.07716   13.0885  0.605118  3.30538     10.5986    76      0
Write 20  KVStore     23.515     3.39745   11.6089  0.169507  2.58285     9.14467    44      0
          FileStore   23.064     3.45711   11.5996  0.138595  2.75962     8.54156    40      0
Write 10  KVStore     22.927     1.73815   5.53792  0.171028  1.05982     9.18403    44      0
          FileStore   21.980     1.8198    6.46675  0.143392  1.20303     8.74401    40      0
Write 5   KVStore     19.680     1.01492   3.10783  0.143758  0.561548    5.92575    36      0
          FileStore   20.017     0.997019  3.05008  0.138161  0.571459    6.844      32      0
Read 30   KVStore     65.852     1.80069   9.30039  0.115153  -           -          -       -
          FileStore   60.688     1.96009   10.1146  0.061657  -           -          -       -
Read 20   KVStore     59.372     1.30479   6.28435  0.016843  -           -          -       -
          FileStore   60.738     1.28383   8.21304  0.012073  -           -          -       -
Read 10   KVStore     65.502     0.608805  3.3917   0.016267  -           -          -       -
          FileStore   55.814     0.7087    4.72626  0.011998  -           -          -       -
Read 5    KVStore     64.176     0.307111  1.76391  0.017174  -           -          -       -
          FileStore   54.928     0.364077  1.90182  0.011999  -           -          -       -

Charts can be viewed here (http://img42.com/ziwjP+) and here (http://img42.com/LKhoo+).


From the above, I'm relieved that the LevelDB backend isn't useless. Most of the metrics are better, and if the LevelDB cache size is increased the results may be even more attractive.
Furthermore, the LevelDB backend sits behind "KeyValueStore", and many optimizations can still be made to improve performance, such as increasing the number of parallel threads or optimizing the I/O path.
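
If anyone wants to reproduce the comparison, the runs behind the table above are roughly the following (a minimal sketch in Python; the pool name and the 60-second run length are placeholders here, not necessarily the exact values I used):

#!/usr/bin/env python
# Rough driver for the rados bench runs above.  Pool name and run
# length are placeholders -- adjust them for your cluster.
import subprocess

POOL = "rbd"      # placeholder pool name
SECONDS = "60"    # placeholder run length per test

for threads in (30, 20, 10, 5):
    # Write phase; --no-cleanup keeps the benchmark objects around so
    # the read phase has something to read.
    subprocess.check_call(["rados", "-p", POOL, "bench", SECONDS, "write",
                           "-t", str(threads), "--no-cleanup"])
    # Sequential read phase at the same concurrency.
    subprocess.check_call(["rados", "-p", POOL, "bench", SECONDS, "seq",
                           "-t", str(threads)])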

Next, I used "rbd bench-write" to test. The result is a pity:

RBD Bench-Write

                    OPS/sec                 Bytes/sec
             KVStore   FileStore     KVStore     FileStore
Seq 4096 5   27.42     716.55        111861.51   2492149.21
Rand 4096 5  28.27     504           112331.42   1683151.29
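
The row labels give the I/O pattern, io-size and io-threads passed to "rbd bench-write"; the invocations were along these lines (the image name is a placeholder, and the flag spellings are from memory, so double-check them against "rbd help" on your version):

# Rough driver for the rbd bench-write runs above.  The image name is
# a placeholder and the option names may differ between rbd versions.
import subprocess

IMAGE = "bench-img"   # placeholder image name

for pattern in ("seq", "rand"):
    subprocess.check_call(["rbd", "bench-write", IMAGE,
                           "--io-size", "4096",
                           "--io-threads", "5",
                           "--io-pattern", pattern])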


Because the kv backend doesn't support read/write operations with offset/length arguments, each read/write has to issue an additional LevelDB read of the whole object. In the rbd case, much of the time is spent reading entire large objects. There are ways to improve this, such as splitting a large object into many small objects, or keeping extra metadata so the expensive full-object read can be avoided.
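
To make the splitting idea concrete, here is a minimal sketch (plain Python, with a dict standing in for LevelDB; the names and the 4K stripe size are only illustrative, not what the branch does today) of striping an object over many small keys, so a write at (offset, length) only rewrites the stripes it overlaps instead of the whole object:

# Stripe an object over many small keys so that a partial write only
# touches the overlapped stripes.  A plain dict stands in for LevelDB;
# a real backend would batch the puts into one atomic LevelDB write,
# and would also keep a header key with the object size (omitted here).
STRIPE_SIZE = 4096

db = {}  # key -> bytes, stand-in for LevelDB

def _key(obj, index):
    return "%s/%08d" % (obj, index)

def write(obj, offset, data):
    end = offset + len(data)
    first, last = offset // STRIPE_SIZE, (end - 1) // STRIPE_SIZE
    for idx in range(first, last + 1):
        stripe_off = idx * STRIPE_SIZE
        # Read-modify-write, but only for this one small stripe.
        stripe = bytearray(db.get(_key(obj, idx), b"").ljust(STRIPE_SIZE, b"\0"))
        lo = max(offset, stripe_off) - stripe_off
        hi = min(end, stripe_off + STRIPE_SIZE) - stripe_off
        src = max(offset, stripe_off) - offset
        stripe[lo:hi] = data[src:src + (hi - lo)]
        db[_key(obj, idx)] = bytes(stripe)

def read(obj, offset, length):
    end = offset + length
    out = bytearray()
    for idx in range(offset // STRIPE_SIZE, (end - 1) // STRIPE_SIZE + 1):
        stripe = db.get(_key(obj, idx), b"\0" * STRIPE_SIZE).ljust(STRIPE_SIZE, b"\0")
        stripe_off = idx * STRIPE_SIZE
        out += stripe[max(offset, stripe_off) - stripe_off:
                      min(end, stripe_off + STRIPE_SIZE) - stripe_off]
    return bytes(out)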

As Sage mentioned in <osd: new key/value backend> (http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend), more kv backends can be added now, and I look forward to more people getting interested in it. I think the radosgw use case could fit the kv store well in a short time.
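
For anyone who wants to try another store behind the OSD: roughly speaking, a backend only has to offer point get/put/delete, ordered range iteration and an atomic batch commit. The toy interface below is just an illustration of that shape, not the actual interface in the tree:

# Toy illustration of the shape a pluggable key/value backend needs:
# point lookups, ordered range scans, and an atomic batch so a whole
# transaction either commits or doesn't.  Not the real interface in
# the tree.
class KVBackend(object):
    def get(self, key):
        raise NotImplementedError

    def iterate(self, start, end):
        """Yield (key, value) pairs with start <= key < end, in order."""
        raise NotImplementedError

    def apply_batch(self, puts, deletes):
        """Apply a dict of puts and a list of deletes atomically."""
        raise NotImplementedError

class MemBackend(KVBackend):
    """In-memory reference implementation, handy for tests."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def iterate(self, start, end):
        for key in sorted(self._data):
            if start <= key < end:
                yield key, self._data[key]

    def apply_batch(self, puts, deletes):
        for key in deletes:
            self._data.pop(key, None)
        self._data.update(puts)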

--

Best Regards,

Wheat

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
