Using Ramdisk wi

On Wed, 30 Jul 2014 10:50:02 -0400 German Anders wrote:

> Hi Christian,
>       How are you? Thanks a lot for the answers; mine are in red.
> 
Most certainly not in red on my mail client...

> --- Original message ---
> >
> > Subject: Re: [ceph-users] Using Ramdisk wi
> > From: Christian Balzer <chibi at gol.com>
> > To: <ceph-users at lists.ceph.com>
> > Cc: German Anders <ganders at despegar.com>
> > Date: Wednesday, 30/07/2014 11:42
> >
> >
> > Hello,
> >
> > On Wed, 30 Jul 2014 09:55:49 -0400 German Anders wrote:
> >
> >>
> >> Hi Wido,
> >>
> >>              How are you? Thanks a lot for the quick response. I know
> >> there is a heavy cost to using a ramdisk, but I want to try it to see
> >> if I could get better performance, since I'm using a 10GbE network
> >> with the following configuration and I can't achieve more than 300MB/s
> >> of throughput on rbd:
> >>
> >
> > Testing the limits of Ceph with a ramdisk based journal to see what is
> > possible in terms of speed (and you will find that it is CPU/protocol
> > bound) is fine.
> > Anything resembling production is a big no-no.
> 
> Got it. Did you try flashcache from Facebook or dm-cache?

No.
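
That said, since your very first mail asked for the actual commands: a
rough, untested sketch of a throwaway tmpfs-backed journal for one OSD
(benchmarking only; the OSD id 12, the size and the paths are just
examples for a default Firefly/FileStore layout):

  # Benchmark only: after a reboot the journal is gone, and the OSD with it.
  service ceph stop osd.12           # or "stop ceph-osd id=12" on Upstart
  ceph-osd -i 12 --flush-journal     # drain whatever is still in the old journal
  mkdir -p /mnt/ramjournal
  mount -t tmpfs -o size=10G tmpfs /mnt/ramjournal
  ln -sf /mnt/ramjournal/journal-12 /var/lib/ceph/osd/ceph-12/journal
  ceph-osd -i 12 --mkjournal         # create the new journal on the tmpfs
  service ceph start osd.12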

> >
> >
> >
> >>
> >> MON Servers (3):
> >>              2x Intel Xeon E3-1270v3 @3.5Ghz (8C)
> >>              32GB RAM
> >>              2x SSD Intel 120G in RAID1 for OS
> >>              1x 10GbE port
> >>
> >> OSD Servers (4):
> >>              2x Intel Xeon E5-2609v2 @2.5Ghz (8C)
> >>              64GB RAM
> >>              2x SSD Intel 120G in RAID1 for OS
> >>              3x SSD Intel 120G for Journals (3 SAS disks per 1 SSD Journal)
> >
> > You're not telling us WHICH actual Intel SSDs you're using.
> > If those are DC3500 ones, then 300MB/s total isn't a big surprise at
> > all, as they are capable of 135MB/s writes at most.
> 
> The SSD model is Intel SSDSC2BB120G4 firm D2010370

That's not really an answer, but then again Intel could have chosen model
numbers that resemble their product names.

That is indeed a DC 3500, so my argument stands.
With those SSDs for your journals, much more than 300MB/s per node is
simply not possible, never mind how fast or slow the HDDs perform.

> >
> >
> >
> >>
> >>              9x SAS 3TB 6G for OSD
> > That would be somewhere over 1GB/s in theory, but given file system and
> > other overheads (what is your replication level?) that's a very
> > theoretical value indeed.
> 
> The RF is 2, so performance should be much better. Also notice that read
> performance is really poor, around 62MB/s...
> 
A replication factor of 2 means that each write is amplified by 2.
So half of your theoretical performance is gone already.
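
To put rough numbers on it (using the ~135MB/s figure above, your 3
journal SSDs and 4 OSD nodes, and remembering that every copy, primary
and replica, goes through a journal first):

  3 journal SSDs x 135MB/s   ~ 405MB/s of raw journal bandwidth per node
  4 OSD nodes x 405MB/s      ~ 1620MB/s aggregate
  divided by 2 (replication) ~ 810MB/s theoretical client write ceiling

And that is before file system, protocol and CPU overhead.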

Do your tests with atop or iostat running on all storage nodes. 
Determine where the bottleneck is: the journal SSDs, the HDDs, or
(unlikely) something else.
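
For example, something like this on each OSD node while the benchmark
runs (the device names are just placeholders for your journal SSDs and
data HDDs):

  iostat -x 2
  # journal SSDs (say sdc-sde) near 100% util, HDDs mostly idle -> journal bound
  # data HDDs (say sdf-sdn) near 100% util                      -> spindle bound
  # neither saturated while throughput stalls                   -> look at CPU and network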

Read performance with RBD is pretty poor (at least for a single client);
it can be improved by tuning the readahead value. See:

http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/8817
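
The short version: bump the readahead of the block device, e.g. for the
kernel RBD client (rbd0 and 4096KB are just examples, test what works
for your workload):

  echo 4096 > /sys/block/rbd0/queue/read_ahead_kb
  blockdev --getra /dev/rbd0     # verify, reported in 512 byte sectors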

This is something the Ceph developers are aware of and hopefully will
address in the future:
https://wiki.ceph.com/Planning/Blueprints/Emperor/Kernel_client_read_ahead_optimization

Christian

> >
> >
> >
> > Christian
> >
> >>
> >>              2x 10GbE port (1 for Cluster Network, 1 for Public Network)
> >>
> >> - 10GbE Switches (1 for Cluster interconnect and 1 for Public network)
> >> - Using Ceph Firefly version 0.80.4.
> >>
> >>              The thing is that with fio, rados bench and vdbench we
> >> only see 300MB/s on writes (random and sequential) with a block size
> >> of 4m and 16 threads, which is pretty low. Yesterday I was talking on
> >> the Ceph IRC channel and came across a presentation that someone from
> >> Fujitsu gave in Frankfurt, plus some mails describing a 10GbE setup
> >> where he achieves almost 795MB/s and more... I would like to know, if
> >> possible, how to implement that so we could improve our Ceph cluster a
> >> little bit more. I already set the scheduler on the SSDs (both OS and
> >> Journal) to [noop] but still didn't notice any improvement. That's why
> >> we would like to try a RAMDISK for the Journals; I've noticed that he
> >> implemented that on their Ceph cluster.
> >>
> >> I would really appreciate help on this. If you need me to send you
> >> some more information about the Ceph setup, please let me know. Also,
> >> if someone could share some detailed config info, that would really
> >> help!
> >>
> >> Thanks a lot,
> >>
> >>
> >> German Anders
> >>
> >>>
> >>> --- Original message ---
> >>> Subject: Re: [ceph-users] Using Ramdisk wi
> >>> From: Wido den Hollander <wido at 42on.com>
> >>> To: <ceph-users at lists.ceph.com>
> >>> Date: Wednesday, 30/07/2014 10:34
> >>>
> >>> On 07/30/2014 03:28 PM, German Anders wrote:
> >>>>
> >>>>
> >>>> Hi Everyone,
> >>>>
> >>>>                                Is anybody using a ramdisk to put the
> >>>> Journal on? If so, could you please share the commands to implement
> >>>> that? I'm having some issues with it and want to test it to see if I
> >>>> could get better performance.
> >>>
> >>> Don't do this. When you lose the journal, you lose the OSD. So a
> >>> reboot of the machine effectively trashes the data on that OSD.
> >>>
> >>> Wido
> >>>
> >>>>
> >>>>
> >>>>
> >>>> Thanks in advance,
> >>>>
> >>>> German Anders
> >>>>
> >>>> _______________________________________________
> >>>> ceph-users mailing list
> >>>> ceph-users at lists.ceph.com
> >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>>>
> >>>
> >>>
> >>> --
> >>> Wido den Hollander
> >>> 42on B.V.
> >>> Ceph trainer and consultant
> >>>
> >>> Phone: +31 (0)20 700 9902
> >>> Skype: contact42on
> >>> _______________________________________________
> >>> ceph-users mailing list
> >>> ceph-users at lists.ceph.com
> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>
> >
> >
> > --
> > Christian Balzer        Network/Systems Engineer
> > chibi at gol.com   	Global OnLine Japan/Fusion Communications
> > http://www.gol.com/
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi at gol.com   	Global OnLine Japan/Fusion Communications
http://www.gol.com/

