About using SSDs with CephFS, with some quantified benchmarks attached

Some questions that confuse me (Ceph 0.94):

1. Is there any way to cache the whole of the metadata in the MDS's memory?   (metadata OSDs' data -----async----> MDS memory)

I am not sure whether I misunderstand the role of the MDS :(   There are many threads advising the use of SSD OSDs for the metadata pool.
The metadata pool stores the inode information for files, so stat, ls and readdir are fast for CephFS.
But if the metadata could also be cached in memory (metadata OSDs' data -----async----> MDS memory), I guess that might be even better?
We can use SSD journals, so write speed should not be the bottleneck, and the cached metadata would not be large even with a huge number of files.  (As I understand it, MooseFS stores all of its metadata in memory?)
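
For reference, this is roughly how I check what the MDS is actually holding in RAM (run on the MDS host; mds.0 is just my daemon id, and counter names may differ between releases, so treat this as a sketch):

# ceph daemon mds.0 perf dump | grep -w inodes
# ps -o rss,cmd -C ceph-mds

The first shows how many inodes are currently cached (to compare against "mds cache size"), the second shows the resident memory of the ceph-mds process.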

2.
Is there any description of how the journal works under the hood? To me it feels a bit like Linux's swap partition ~

Using an Intel PCIe SSD as the journal device for the HDD OSDs,
I ran the command below for a rough benchmark of all the OSDs on a host simultaneously:

# for i in $(ps aux | grep osd | awk '{print $14}' | grep -v "^$" | sort); do ceph tell osd.$i bench & done
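
(Field $14 just happens to be the OSD id in my ps output. Something like the following should be a less fragile way to pick up the local OSD ids, assuming the default /var/lib/ceph/osd/ceph-<id> data directory layout:)

# for i in $(ls /var/lib/ceph/osd | sed 's/^ceph-//'); do ceph tell osd.$i bench & done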

Compared with another host without SSD journals, these got a better bytes_per_sec, up by more than 100%:
OSDs with HDD journals at ~30 MB/s ------> HDD OSDs with SSD journals at more than 60 MB/s.   (12 OSDs per host; the hosts are almost identical.)

MB/s per OSD (HDD journal + HDD data)
39
35
35
35
33
31
29
26
26
26
26
25
The top 39 MB/s entry is actually a SATA SSD OSD with a SATA SSD journal, but it does not seem to be much faster than the others with HDD journal + HDD data.

MB/s per OSD (PCIe SSD journal + HDD data)
195
129
92
88
71
71
65
61
57
54
52
50
The 195 MB/s entry is PCIe SSD journal + SSD data, which is very fast; the others are PCIe SSD journal + HDD data.


"bytes_per_sec": 166451390.000000 for single bench on (PCIE Journal + HDD)    158.74MB/s
"bytes_per_sec": 78472933.000000 for single bench on   (HDD Journal + HDD)       74.83MB/s

It seems that the "data ---> HDD journal" step is probably the main bottleneck? How can I track this down?
data ----> SSD journal ------> OSD data partition
data ----> HDD journal -----> OSD data partition
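
What I intend to use to narrow this down, as a sketch (osd.3 is only an example; the perf counter names are from my Hammer boxes and may differ in other releases):

# ceph osd perf
# ceph daemon osd.3 perf dump | grep -i latency
# iostat -x 1

"ceph osd perf" shows per-OSD commit/apply latency, the daemon perf dump breaks out journal vs. filestore latencies, and iostat shows whether the journal device or the data disk is the one that is saturated.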

3.
Any cache or memory tuning suggestions for better CephFS performance?

The key parts of my ceph.conf are below:
[global]
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 512
osd pool default pgp num = 512
osd journal size = 10000

[mds]
mds cache size = 11474836

[osd]
osd op threads = 4
filestore op threads = 4
osd crush update on start = false
# 256 MB
osd max write size = 256
# 256 MB, in bytes
journal max write bytes = 268435456
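
On the client side (ceph-fuse) I am also looking at the options below; the values are only what I am experimenting with, not recommendations, and the option defaults should be double-checked against the docs for your release:

[client]
# object cacher size for ceph-fuse, in bytes (512 MB here is just my guess)
client oc size = 536870912
# number of inode/dentry entries cached by the client
client cache size = 16384
# cap on client readahead, in bytes
client readahead max bytes = 4194304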

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
