Re: RBD performance - tuning hints

On 08/30/2012 09:12 AM, Alexandre DERUMIER wrote:
well, you have to compare
- a pure SSD (via PCIe or SAS-6G)        vs.
- Ceph-Journal, which goes 2x over 10GbE with IP
  Client -> primary-copy -> 2nd-copy
  (= redundancy over Ethernet distance)

Sure, but the first OSD acks to the client before replicating to the other OSDs.

Client -> primary-copy -> 2nd-copy
        <-ack
          primary-copy -> 2nd-copy
                       -> 3rd-copy

Or am I wrong?

RBD waits for the data to be on disk on all replicas. It's pretty easy
to relax this to in memory on all replicas, but there's no option for
that right now.

Josh


----- Original Message -----

From: "Dieter Kasper" <d.kasper@xxxxxxxxxxxx>
To: "Alexandre DERUMIER" <aderumier@xxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx, "Andreas Bluemle" <andreas.bluemle@xxxxxxxxxxx>
Sent: Thursday, 30 August 2012 18:02:05
Subject: Re: RBD performance - tuning hints

On Thu, Aug 30, 2012 at 05:46:35PM +0200, Alexandre DERUMIER wrote:
Thanks

8x SSD, 200GB each

20000 IOPS seems pretty low, no?
well, you have to compare
- a pure SSD (via PCIe or SAS-6G) vs.
- Ceph-Journal, which goes 2x over 10GbE with IP
Client -> primary-copy -> 2nd-copy
(= redundancy over Ethernet distance)

I'm curious about the answer from Inktank.

-Dieter



for @Inktank:

Is there a bottleneck somewhere in Ceph?
Maybe "SimpleMessenger dispatching: cause of performance problems?"
from Thu, 16 Aug 2012 18:08:39 +0200
by <andreas.bluemle@xxxxxxxxxxx>
can be an answer.
Especially if a small number of OSDs is used.


I said that because I would like to know if it scales by adding new nodes.

Has Inktank already done any random IOPS benchmarks? (I always see sequential throughput benchmarks on the mailing list.)


----- Original Message -----

From: "Dieter Kasper" <d.kasper@xxxxxxxxxxxx>
To: "Alexandre DERUMIER" <aderumier@xxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx
Sent: Thursday, 30 August 2012 17:33:42
Subject: Re: RBD performance - tuning hints

On Thu, Aug 30, 2012 at 05:28:02PM +0200, Alexandre DERUMIER wrote:
Thanks for the report!

Compared to your first benchmark, is this with RBD 4M or 64K?
with 4MB (see attached config info)

Cheers,
-Dieter


(how many SSDs per node?)
8x SSD, 200GB each




----- Original Message -----

From: "Dieter Kasper" <d.kasper@xxxxxxxxxxxx>
To: "Alexandre DERUMIER" <aderumier@xxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx
Sent: Thursday, 30 August 2012 16:56:34
Subject: Re: RBD performance - tuning hints

Hi Alexandre,

with the 4 filestore parameters below, some fio values could be increased:
filestore max sync interval = 30
filestore min sync interval = 29
filestore flusher = false
filestore queue max ops = 10000
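
In case it helps anyone reproducing this, here is a sketch of how these settings might look in ceph.conf; placing them in the [osd] section and the comments are assumptions on my side, adjust as needed:

[osd]
    # flush the journal to the filestore at most every 30 s ...
    filestore max sync interval = 30
    # ... and no sooner than 29 s after the previous sync
    filestore min sync interval = 29
    # disable the background flusher
    filestore flusher = false
    # allow more outstanding operations in the filestore queue
    filestore queue max ops = 10000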

###### IOPS
fio_read_4k_64: 9373
fio_read_4k_128: 9939
fio_randwrite_8k_16: 12376
fio_randwrite_4k_16: 13315
fio_randwrite_512_32: 13660
fio_randwrite_8k_32: 17318
fio_randwrite_4k_32: 18057
fio_randwrite_8k_64: 19693
fio_randwrite_512_64: 20015 <<<
fio_randwrite_4k_64: 20024 <<<
fio_randwrite_8k_128: 20547 <<<
fio_randwrite_4k_128: 20839 <<<
fio_randwrite_512_128: 21417 <<<
fio_randread_8k_128: 48872
fio_randread_4k_128: 50002
fio_randread_512_128: 51202

###### MB/s
fio_randread_2m_32: 628
fio_read_4m_64: 630
fio_randread_8m_32: 633
fio_read_2m_32: 637
fio_read_4m_16: 640
fio_randread_4m_16: 652
fio_write_2m_32: 660
fio_randread_4m_32: 677
fio_read_4m_32: 678
(...)
fio_write_4m_64: 771
fio_randwrite_2m_64: 789
fio_write_8m_128: 796
fio_write_4m_32: 802
fio_randwrite_4m_128: 807 <<<
fio_randwrite_2m_32: 811 <<<
fio_write_2m_128: 833 <<<
fio_write_8m_64: 901 <<<

Best Regards,
-Dieter


On Wed, Aug 29, 2012 at 10:50:12AM +0200, Alexandre DERUMIER wrote:
Nice results!
(Can you run the same benchmark from a qemu-kvm guest with the virtio driver?
I did some benchmarks a few months ago with Stephan Priebe, and we were never able to get more than 20000 IOPS with a full-SSD 3-node cluster.)

How can I set the variables that control when the journal data goes to the OSD? (after X seconds and/or when it is Y% full)
I think you can try tuning these values:

filestore max sync interval = 30
filestore min sync interval = 29
filestore flusher = false
filestore queue max ops = 10000
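
(A sketch of how one might also try these on a running OSD via injectargs, e.g.:

ceph tell osd.0 injectargs '--filestore_max_sync_interval 30 --filestore_min_sync_interval 29'

Whether every filestore option takes effect at runtime this way is an assumption on my part; restarting the OSD with the values set in ceph.conf is the safe route.)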



----- Original Message -----

From: "Dieter Kasper" <d.kasper@xxxxxxxxxxxx>
To: ceph-devel@xxxxxxxxxxxxxxx
Cc: "Dieter Kasper (KD)" <d.kasper@xxxxxxxxxxxx>
Sent: Tuesday, 28 August 2012 19:48:42
Subject: RBD performance - tuning hints

Hi,

on my 4-node system (SSD + 10GbE, see bench-config.txt for details)
I can observe a pretty nice rados bench performance
(see bench-rados.txt for details):

Bandwidth (MB/sec): 961.710
Max bandwidth (MB/sec): 1040
Min bandwidth (MB/sec): 772
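
For reference, this is the kind of output produced by a rados bench write test; the invocation below is a sketch, and the pool name, runtime and number of concurrent ops are assumptions on my part:

rados -p rbd bench 60 write -t 16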


Also the bandwidth performance generated with
fio --filename=/dev/rbd1 --direct=1 --rw=$io --bs=$bs --size=2G --iodepth=$threads --ioengine=libaio --runtime=60 --group_reporting --name=file1 --output=fio_${io}_${bs}_${threads}

.... is acceptable, e.g.
fio_write_4m_16 795 MB/s
fio_randwrite_8m_128 717 MB/s
fio_randwrite_8m_16 714 MB/s
fio_randwrite_2m_32 692 MB/s
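
For anyone who wants to reproduce the full matrix, here is a sketch of a wrapper loop around the fio command above; the lists of I/O patterns, block sizes and iodepths are assumptions derived from the result names in this thread:

#!/bin/bash
# run fio against the RBD device for every combination of I/O pattern,
# block size and iodepth, writing one result file per combination
for io in read write randread randwrite; do
  for bs in 512 4k 8k 2m 4m 8m; do
    for threads in 16 32 64 128; do
      fio --filename=/dev/rbd1 --direct=1 --rw=$io --bs=$bs --size=2G \
          --iodepth=$threads --ioengine=libaio --runtime=60 \
          --group_reporting --name=file1 \
          --output=fio_${io}_${bs}_${threads}
    done
  done
done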


But the write IOPS seem to be limited to around 19k ...
RBD object size:       4M      64k (= optimal_io_size)
fio_randread_512_128 53286 55925
fio_randread_4k_128 51110 44382
fio_randread_8k_128 30854 29938
fio_randwrite_512_128 18888 2386
fio_randwrite_512_64 18844 2582
fio_randwrite_8k_64 17350 2445
(...)
fio_read_4k_128 10073 53151
fio_read_4k_64 9500 39757
fio_read_4k_32 9220 23650
(...)
fio_read_4k_16 9122 14322
fio_write_4k_128 2190 14306
fio_read_8k_32 706 13894
fio_write_4k_64 2197 12297
fio_write_8k_64 3563 11705
fio_write_8k_128 3444 11219


Any hints for tuning the IOPS (read and/or write) would be appreciated.

How can I set the variables that control when the journal data goes to the OSD? (after X seconds and/or when it is Y% full)


Kind Regards,
-Dieter


