Re: Ceph Performance


 



On 01/11/2014 09:48 PM, Bradley Kite wrote:
On 11 January 2014 10:40, Cedric Lemarchand <cedric@xxxxxxxxxxx> wrote:


    On 10/01/2014 17:16, Bradley Kite wrote:

        This might explain why the performance is not so good - on each
        connection it can only do 1 transaction at a time:

        1) Submit write
        2) wait...
        3) Receive ACK

        Then repeat...

        But if the OSD protocol supports multiple transactions it could
        do something like this:

        1) Submit 1
        2) Submit 2
        3) Submit 3
        4) Recv ACK 1
        5) Submit 4
        6) Recv ACK 2
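
To illustrate the difference between the two submission patterns above, here is a minimal Python sketch. It does not talk to Ceph at all: `write()` is a hypothetical stand-in that just sleeps for an assumed round-trip time (`ACK_DELAY`, an illustrative value), and `max_in_flight` plays the role of the number of transactions allowed on the wire at once.

```python
import asyncio
import time

ACK_DELAY = 0.01  # assumed round-trip time per write in seconds (illustrative only)

async def write(i):
    """Hypothetical stand-in for one write: submit, then wait for the ACK."""
    await asyncio.sleep(ACK_DELAY)
    return i

async def run(num_writes, max_in_flight):
    """Issue num_writes writes with at most max_in_flight outstanding at once."""
    slots = asyncio.Semaphore(max_in_flight)

    async def one(i):
        # A new write is only submitted once a slot is free.
        async with slots:
            return await write(i)

    start = time.perf_counter()
    acks = await asyncio.gather(*(one(i) for i in range(num_writes)))
    return len(acks), time.perf_counter() - start

def compare(num_writes=20):
    """Return (acks, seconds) for 1 write in flight and for 4 in flight."""
    serial = asyncio.run(run(num_writes, 1))
    pipelined = asyncio.run(run(num_writes, 4))
    return serial, pipelined

if __name__ == "__main__":
    (n1, t1), (n2, t2) = compare()
    print(f"1 in flight: {n1} writes in {t1:.3f}s")
    print(f"4 in flight: {n2} writes in {t2:.3f}s")
```

With one write in flight the total time is roughly the number of writes times the round-trip time; allowing several in flight overlaps the waiting, which is the effect described in the second list.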

    What you are describing here are sync and async writes. From what I
    understand, those behaviours are defined by the flags passed to the
    initial write call (POSIX O_SYNC or O_ASYNC), and ideally they have to
    be honoured all the way to the final storage back-end (the drives) to
    ensure data consistency if something goes wrong. For example:
    fio => iSCSI => CEPH => drives

    What options did you pass to your benchmark tool?



Hi Cedric,

I used fio for testing, with the following options:

rw=randwrite
filename=/dev/sdc # iscsi mapped device
# filename=/dev/rbd1 # rbd kernel mapped device
ioengine=posixaio
iodepth=256
direct=1
runtime=60
ramp_time=30
blocksize=4k

So it is trying to do async writes, but if it can only submit to 12
OSDs at one time then this might be causing the poor performance, e.g.
it cannot fit 256 (iodepth) IOs into 12 TCP connections.


Keep one thing in mind with Ceph: "Consistency goes OVER availability"

An OSD is of course able to handle multiple IO operations at the same time, but it's also up to the client to send those operations in parallel.

When the Primary OSD of a Placement Group receives the write, it replicates it to the Secondary OSD(s) for that PG. Only when ALL OSDs for the PG have received the write and written it to their journals does the Primary OSD send the "Ack" back to the client.

Network latency and journal latency are key here.
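
As a rough way to reason about the ACK path described above, here is a small back-of-the-envelope model in Python. It is only a sketch under stated assumptions: the function names and parameters are made up for illustration, the primary's journal write is assumed to overlap with the round trip to the replicas, all replicas are assumed to behave identically, and CPU and queueing time are ignored. The throughput bound is Little's law (concurrency divided by latency), not anything Ceph-specific.

```python
def replicated_write_latency_ms(client_rtt_ms, replica_rtt_ms, journal_ms, replicas=1):
    """Approximate time until the client sees the ACK for one write.

    Assumption: the primary journals the write while the replicas are
    being written, so the slower of the two paths decides the latency.
    """
    primary_path = journal_ms
    # Replica path: network round trip to the replica plus its journal write.
    replica_path = replica_rtt_ms + journal_ms if replicas > 0 else 0.0
    return client_rtt_ms + max(primary_path, replica_path)

def max_iops(latency_ms, in_flight):
    """Little's law: sustainable throughput = operations in flight / latency."""
    return in_flight * 1000.0 / latency_ms

if __name__ == "__main__":
    # Placeholder numbers, not measurements.
    latency = replicated_write_latency_ms(client_rtt_ms=0.25,
                                          replica_rtt_ms=0.25,
                                          journal_ms=0.5)
    print(f"latency per write: {latency} ms")
    print(f"upper bound with 1 write in flight: {max_iops(latency, 1):.0f} IOPS")
```

With these placeholder values the model gives 1.0 ms per write, so a client that keeps only one write in flight cannot exceed about 1000 IOPS no matter how fast the drives are; every extra millisecond of network or journal latency lowers that ceiling further.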

I don't know how to confirm this, though.

Regards
--
Brad.



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on




