RE: Rados Performance help needed...

Correction:

Missed a carriage return when I copy/pasted at first, sorry...

Ryan

-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Ryan Nicholson
Sent: Wednesday, October 31, 2012 5:50 PM
To: ceph-devel@xxxxxxxxxxxxxxx
Subject: Rados Performance help needed...

Guys:
I have some tuning questions. I'm not getting the write speeds I'm expecting, and am open to suggestions.
I'm using RADOS on Ceph 0.48.0. I have 12 OSDs split (using CRUSH rules and separate pools) into 2 pools this way:
4 OSD servers:
                - Dell 2850s, 12 GB RAM
                - 64-bit CentOS 6.2
                - 2 SCSI OSDs each
                - 1 eSATA OSD each
                - Each server is connected to the LAN over a 2 Gb/s bond
3 separate monitor servers as well:
                - Dell 1850s, 8 GB RAM
SCSI pool:
                8 OSDs of 146 GB apiece, each a stripe of two Ultra-320 disks.
                Each stripe is partitioned with 20 GB at the front for the Ceph journal; the remainder is the OSD partition. The journal serves the OSD on the same device.
                Formatted ext4, mounted with (rw,noatime,data=writeback,barrier=0,nobh).

Large "data" pool:
                4 OSDs of 3.2 TB apiece, each an identically configured LaCie Big4 Disk Quadra.
                Each LaCie is 4 TB of 7200 RPM SATA in a RAID-5 handled entirely by the enclosure's own hardware, attached to the server through a 64-bit Sil3124 eSATA card.
                Note: the LaCies do NOT support JBOD, believe it or not, though you can stripe them.
                Formatted ext4, mounted with (rw,noatime,data=writeback,barrier=0,nobh); the mounts are sketched below.
                The journal for each of these OSDs is an entire 146 GB SCSI stripe.
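
For reference, the OSD filesystems are mounted roughly like this (device names are placeholders; the mount points match the "osd data = /ceph/osd/$name" setting in the ceph.conf below):

                # SCSI OSD: /dev/sdX1 is the 20 GB journal partition, /dev/sdX2 the OSD filesystem
                mount -t ext4 -o rw,noatime,data=writeback,barrier=0,nobh /dev/sdX2 /ceph/osd/osd.N
                # LaCie OSD: whole eSATA device; its journal is a separate whole SCSI stripe
                mount -t ext4 -o rw,noatime,data=writeback,barrier=0,nobh /dev/sdY1 /ceph/osd/osd.M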


Ok, so here's my issue:
As a test, I stopped Ceph completely and, on each server, ran a simple disk/filesystem write/read test directly against the OSD mounts. Here are those results:
command: dd if=10GB_testFile of=/path-to-osd-mount/test.io bs=1048576

                SCSI: 10 GB file written at 365 MB/s to a single SCSI stripe of two 73 GB Ultra-320s;
                                the same 10 GB file read back at 595 MB/s.
                eSATA: the same 10 GB file written at 275 MB/s to a single LaCie, and read back at 443 MB/s.
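
(The read-back numbers came from simply reading the test file back out with dd, roughly:

                dd if=/path-to-osd-mount/test.io of=/dev/null bs=1048576)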

I understand that real life numbers and dd numbers are hardly ever the same.
Now, using Ceph, I created two identical RBD images called Test20G, one in each pool. I know (literally by watching the drive access lights) that the images landed in the proper pools and are writing to the desired drives. Both images were mapped on the same client, formatted ext4, and mounted with (rw,noatime,data=writeback,barrier=0,nobh); the sequence is sketched below.
I ran the same dd test and found the SCSI image averaged 105 MB/s on writes, while the LaCie image averaged 25 MB/s.
Network utilization never crossed 6% on ANY OSD server, and the sending client never exceeded 27%.
So, help me out: I have an entire 146 GB journal in front of each LaCie OSD; I expected them to fare much better than they did.
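
For reference, the client-side sequence was roughly this (shown for the SCSI pool; image size and the rbd device name are from memory):

                rbd create Test20G --size 20480 --pool SCSI
                rbd map Test20G --pool SCSI
                mkfs.ext4 /dev/rbd0
                mount -o rw,noatime,data=writeback,barrier=0,nobh /dev/rbd0 /mnt/test-scsi
                dd if=10GB_testFile of=/mnt/test-scsi/test.io bs=1048576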

Any help would be appreciated! Thanks!
Ryan Nicholson

Here's the top of my OSD map (SCSI is the fast SCSI pool, data is the LaCie pool):

pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 768 pgp_num 768 last_change 1 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 crush_ruleset 1 object_hash rjenkins pg_num 768 pgp_num 768 last_change 1 owner 0
pool 2 'rbd' rep size 2 crush_ruleset 2 object_hash rjenkins pg_num 768 pgp_num 768 last_change 1 owner 0
pool 3 'SCSI' rep size 2 crush_ruleset 3 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 12 owner 0
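
(The SCSI/data split is done with per-pool CRUSH rules; the rules themselves can be dumped and inspected with something like:

                ceph osd getcrushmap -o /tmp/crushmap
                crushtool -d /tmp/crushmap -o /tmp/crushmap.txt)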
---------------ceph.conf---------------
; Ceph Cluster SCSI-System Configuration - Ryan Nicholson 10-28-2012 SunPM

[global]
;        auth supported = cephx
        log file = /ceph/log/ceph-$name.log
        pid file = /ceph/lock/ceph-$name.pid
        
[client]
        rbd cache = true

[mon]
        mon data = /ceph/mon/$name


[mon.a]
        host = ceph-mdc1.storage.broadcast.kcrg
        mon addr = 10.9.181.10:6789
;       debug mon = 1
[mon.b]
        host = ceph-mdc2.storage.broadcast.kcrg
        mon addr = 10.9.181.11:6789
;       debug mon = 1
[mon.c]
        host = ceph-mdc3.storage.broadcast.kcrg
        mon addr = 10.9.181.12:6789
;       debug mon = 1

[mds]
;       keyring = /ceph/security/keyring.$name

[mds.a]
        host = ceph-mdc1.storage.broadcast.kcrg
;       debug mds = 1

[mds.b]
        host = ceph-mdc2.storage.broadcast.kcrg
;       debug mds = 1

[mds.c]
        host = ceph-mdc3.storage.broadcast.kcrg
;       debug mds = 1

[osd]
        osd journal size = 0
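;       size 0 because the journals are raw partitions/devices; as I understand it, the whole device gets used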
        osd data = /ceph/osd/$name
;       debug osd = 1
        journal dio = true
        journal aio = true
        filestore xattr use omap = true

[osd.0]
        host = ceph-osd1.storage.broadcast.kcrg
        osd journal = /dev/sda1
[osd.1]
        host = ceph-osd1.storage.broadcast.kcrg
        osd journal = /dev/sdb1
[osd.2]
        host = ceph-osd1.storage.broadcast.kcrg
        osd journal = /dev/sdc

[osd.3]
        host = ceph-osd2.storage.broadcast.kcrg
        osd journal = /dev/sda1
[osd.4]
        host = ceph-osd2.storage.broadcast.kcrg
        osd journal = /dev/sdb1
[osd.5]
        host = ceph-osd2.storage.broadcast.kcrg
        osd journal = /dev/sdc

[osd.6]
        host = ceph-osd3.storage.broadcast.kcrg
        osd journal = /dev/sda1
[osd.7]
        host = ceph-osd3.storage.broadcast.kcrg
        osd journal = /dev/sdb1
[osd.8]
        host = ceph-osd3.storage.broadcast.kcrg
        osd journal = /dev/sdc

[osd.9]
        host = ceph-osd4.storage.broadcast.kcrg
        osd journal = /dev/sda1
[osd.10]
        host = ceph-osd4.storage.broadcast.kcrg
        osd journal = /dev/sdb1
[osd.11]
        host = ceph-osd4.storage.broadcast.kcrg
        osd journal = /dev/sdc



Ryan Nicholson|Engineer
KCRG-TV9 | Digital Television
 319.361.5102
 501 Second Ave SE
 Cedar Rapids, IA 52401
 ryan.nicholson@xxxxxxxx




--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

