Re: Performance test on ceph v0.23 + EXT4 and Btrfs

On Wed, 2010-12-01 at 09:35 +0800, Jeff Wu wrote:
> 
> On Wed, 2010-12-01 at 01:07 +0800, Gregory Farnum wrote:
> > On Mon, Nov 29, 2010 at 10:19 PM, Jeff Wu <cpwu@xxxxxxxxxxxxx> wrote:
> > > Is "40-50MB/s" the speed of the bench run on the local btrfs disk,
> > > not the speed of a bench run from a client to the osd server?
> > > With that speed, would a bench run from a client to the osd server
> > > get about 20~25MB/s (40~50MB/s / 2)?
> > Data on Ceph is replicated across 2 OSDs (by default; this is
> > configurable). So while figuring out potential performance involves a
> > lot of variables, in a simple case like this where you aren't bounded
> > by network bandwidth you'll find that your read/write performance
> > simply tracks the slower disk. I'd expect your Ceph tests (at least
> > the streaming ones) to run at 40-50MB/s.
> 
> Hi Greg, thank you very much for your quick reply.
> > 
> > Given that everything else is okay, I cannot stress enough that
> > running without a journal is going to cause significant performance
> > degradations. I have a hard time believing that it's responsible for
> > 13-second latencies, but it's possible. So how about you set up a
> > journal (it can just be a file or new partition on the drives you're
> > already using) and report back your results after you do that. :)
> 
> I will add a journal to ceph.conf and try it.
> 
> 
Hi Greg,

Following your suggestion, I added the journal config:
"
osd data = /opt/ceph/data/osd$id
osd journal = /home/transoft/data/osd$id/journal
filestore journal writeahead = true
osd journal size = 10000
" 
to ceph.conf. The full ceph.conf is attached below.
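As an aside: if I remember right, "osd journal size" is in megabytes, so
10000 gives each OSD a roughly 10 GB journal file. A quick sanity check
that the journal files actually got created (paths taken from the config
above, with $id expanded by hand) could be:

$ ls -lh /home/transoft/data/osd0/journal
$ ls -lh /home/transoft/data/osd1/journal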

Then I ran "$ sudo ceph osd tell 0 bench" and "$ sudo ceph osd tell 1
bench" three times each and got these results:


$ sudo ceph -w

osd0 172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 29.818194 sec at 28201 KB/sec
osd0 172.16.10.42:6800/17347 2 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 30.013058 sec at 34801 KB/sec
osd0 172.16.10.42:6800/17347 3 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 30.463511 sec at 30274 KB/sec

osd1 172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 165.067603 sec at 6329 KB/sec
osd1 172.16.10.65:6800/4845 2 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 181.034333 sec at 5782 KB/sec
osd1 172.16.10.65:6800/4845 3 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 196.055812 sec at 5334 KB/sec

I also used "dd" to test the raw drives and got these logs:

1. OSD0, /opt formatted with mkfs.btrfs

$ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024 
1024+0 records in
1024+0 records out
2147483648 bytes transferred in 21.4497 secs (100 MB/sec)

2. OSD1, /opt formatted with mkfs.btrfs

~$ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
1024+0 records in
1024+0 records out
2147483648 bytes transferred in 48.2037 secs (44.6 MB/sec)

From these logs, the OSD1 disk speed might be limiting the test performance.
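One thing worth ruling out is the page cache: a plain dd of 2 GB can finish
before all the data has reached the disk. A variant that flushes data before
reporting the time (just a sketch, same file path as above, using GNU dd's
conv=fdatasync) would be:

$ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024 conv=fdatasync

If OSD1 still shows a much lower number than OSD0 with that, the slow drive
really is the bottleneck.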

I also ran into an issue. Take the following steps:

$ mkcephfs -c ceph.conf -v --mkbtrfs -a
$ init-ceph -c ceph.conf --btrfs -v -a start
then execute:
$ init-ceph -c ceph.conf --btrfs -v -a stop

This command can't stop the cosd processes on OSD0 and OSD1:
OSD0:
/usr/local/bin/cosd -i 0 -c ceph.conf
OSD1:
/usr/local/bin/cosd -i 1 -c ceph.conf


Then I created the folder "/var/run/ceph" on the OSD0 and OSD1 hosts
manually and executed:

$ init-ceph -c ceph.conf --btrfs -v -a stop

This time the command does stop the cosd processes on OSD0 and OSD1:

/usr/local/bin/cosd -i 0 -c ceph.conf
/usr/local/bin/cosd -i 1 -c ceph.conf
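So it looks like init-ceph relies on something under /var/run/ceph
(presumably the daemons' pid files) to find the cosd processes it should
stop, and cosd does not create that directory itself. A small sketch of the
manual workaround, using the OSD hostnames from ceph.conf (the ssh loop is
just one way to do it):

$ for h in ubuntu-osd0 ubuntu-osd1; do ssh $h 'sudo mkdir -p /var/run/ceph'; done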


Thanks,
Jeff.Wu

> 
> > Adding a journal to the OSDs lets them turn all their random writes
> > into streaming ones.
> > -Greg
> 

=========================================================
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-12-01 10:45:13.670910 mon <- [osd,tell,0,bench]
2010-12-01 10:45:13.671180 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-12-01 10:45:29.350198 mon <- [osd,tell,0,bench]
2010-12-01 10:45:29.350457 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-12-01 10:45:31.000281 mon <- [osd,tell,0,bench]
2010-12-01 10:45:31.000560 mon0 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-12-01 10:45:34.860782 mon <- [osd,tell,1,bench]
2010-12-01 10:45:34.861020 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-12-01 10:45:36.760811 mon <- [osd,tell,1,bench]
2010-12-01 10:45:36.761161 mon2 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-12-01 10:45:37.530714 mon <- [osd,tell,1,bench]
2010-12-01 10:45:37.530968 mon2 -> 'ok' (0)

transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph -w

2010-12-01 10:44:59.450653    pg v13: 528 pgs: 528 active+clean; 12 KB
data, 5304 KB used, 219 GB / 219 GB avail
2010-12-01 10:44:59.451365   mds e5: 1/1/1 up {0=up:active}, 1
up:standby
2010-12-01 10:44:59.451387   osd e6: 2 osds: 2 up, 2 in
2010-12-01 10:44:59.451412   log 2010-12-01 10:43:43.044865 mon0
172.16.10.171:6789/0 7 : [INF] mds0 172.16.10.171:6801/2482 up:active
2010-12-01 10:44:59.451440   mon e1: 3 mons at
{0=172.16.10.171:6789/0,1=172.16.10.171:6790/0,2=172.16.10.171:6791/0}
2010-12-01 10:46:45.000262   log 2010-12-01 10:45:15.599526 osd0
172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 29.818194 sec at 28201 KB/sec
2010-12-01 10:46:45.000262   log 2010-12-01 10:45:46.062142 osd0
172.16.10.42:6800/17347 2 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 30.013058 sec at 34801 KB/sec
2010-12-01 10:46:45.000262   log 2010-12-01 10:46:16.836607 osd0
172.16.10.42:6800/17347 3 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 30.463511 sec at 30274 KB/sec
2010-12-01 10:48:20.042152    pg v14: 528 pgs: 528 active+clean; 32780
KB data, 888 MB used, 218 GB / 219 GB avail
2010-12-01 10:50:50.038298    pg v15: 528 pgs: 528 active+clean; 73740
KB data, 54928 KB used, 219 GB / 219 GB avail
2010-12-01 10:52:15.074470    pg v16: 528 pgs: 528 active+clean; 73740
KB data, 79440 KB used, 219 GB / 219 GB avail
2010-12-01 10:54:55.546098   log 2010-12-01 11:52:34.244851 osd1
172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 165.067603 sec at 6329 KB/sec
2010-12-01 10:54:55.546098   log 2010-12-01 11:55:52.010739 osd1
172.16.10.65:6800/4845 2 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 181.034333 sec at 5782 KB/sec
2010-12-01 10:54:55.546098   log 2010-12-01 11:59:09.560115 osd1
172.16.10.65:6800/4845 3 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 196.055812 sec at 5334 KB/sec
2010-12-01 10:55:01.001357    pg v17: 528 pgs: 528 active+clean; 73741
KB data, 1106 MB used, 218 GB / 219 GB avail


============ceph.conf====================


;
; Sample ceph ceph.conf file.
;
; This file defines cluster membership, the various locations
; that Ceph stores data, and any other runtime options.

; If a 'host' is defined for a daemon, the start/stop script will
; verify that it matches the hostname (or else ignore it).  If it is
; not defined, it is assumed that the daemon is intended to start on
; the current host (e.g., in a setup with a startup.conf on each
; node).

; global
[global]
; enable secure authentication
; auth supported = cephx
keyring = /etc/ceph/keyring.bin
; monitors
;  You need at least one.  You need at least three if you want to
;  tolerate any node failures.  Always create an odd number.
[mon]
mon data = /opt/ceph/data/mon$id
;mon data = /home/transoft/data/mon$id

; logging, for debugging monitor crashes, in order of
; their likelihood of being helpful :)
;debug ms = 20
;debug mon = 20
;debug paxos = 20
;debug auth = 20

[mon0]
host = ubuntu-mon0
mon addr = 172.16.10.171:6789

[mon1]
host = ubuntu-mon0
mon addr = 172.16.10.171:6790

[mon2]
host = ubuntu-mon0
mon addr = 172.16.10.171:6791

; mds
;  You need at least one.  Define two to get a standby.
[mds]
; where the mds keeps its secret encryption keys
keyring = /etc/ceph/keyring.$name

; mds logging to debug issues.
;debug ms = 20
;debug mds = 20

[mds.0]
host = ubuntu-mon0

[mds.1]
host = ubuntu-mon0

; osd
;  You need at least one.  Two if you want data to be replicated.
;  Define as many as you like.
[osd]
; This is where the btrfs volume will be mounted.
;osd data = /opt/ceph/data/osd$id
osd class tmp = /var/lib/ceph/tmp

; Ideally, make this a separate disk or partition.  A few
; hundred MB should be enough; more if you have fast or many
; disks.  You can use a file under the osd data dir if need be
; (e.g. /data/osd$id/journal), but it will be slower than a
; separate disk or partition.

        ; This is an example of a file-based journal.
;osd journal = /home/transoft/data/osd$id/journal
;filestore journal writeahead = true
; journal size, in megabytes
;osd journal size = 1000 
keyring = /etc/ceph/keyring.$name

; osd logging to debug osd issues, in order of likelihood of being
; helpful
;debug ms = 20
;debug osd = 20
;debug filestore = 20
;debug journal = 20

[osd0]
host = ubuntu-osd0
osd data = /opt/ceph/data/osd$id
osd journal = /home/transoft/data/osd$id/journal
filestore journal writeahead = true
osd journal size = 10000 
; if 'btrfs devs' is not specified, you're responsible for
; setting up the 'osd data' dir.  if it is not btrfs, things
; will behave up until you try to recover from a crash (which is
; usually fine for basic testing).
; btrfs devs = /dev/sdx

[osd1]
host = ubuntu-osd1
osd data = /opt/ceph/data/osd$id
osd journal = /home/transoft/data/osd$id/journal
filestore journal writeahead = true
osd journal size = 10000 
;btrfs devs = /dev/sdy

;[osd2]
;host = zeta
;btrfs devs = /dev/sdx

;[osd3]
;host = eta
;btrfs devs = /dev/sdy






--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

