Re: Multiple journals and an OSD on one SSD doable?


 



Just used the method from the link you sent me to test one of the EVO 850s. With one job it reached around 2.5MB/s, but it didn't max out until around 32 jobs, at 24MB/s:

sudo fio --filename=/dev/sdh --direct=1 --sync=1 --rw=write --bs=4k --numjobs=32 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
write: io=1507.4MB, bw=25723KB/s, iops=6430, runt= 60007msec

Also tested a Micron 550 we had sitting around, and it maxed out at 2.5MB/s. Both results conflict with the chart.
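As a quick sanity check on the fio output (a hypothetical sketch, not part of the original test run): with a fixed 4k block size, the reported bandwidth and IOPS are tied together, so the two figures can be cross-checked against each other.

```python
# Cross-check the fio figures quoted above: at a fixed block size,
# IOPS = bandwidth / block size. Numbers are the ones reported in this
# thread, not new measurements.

BLOCK_SIZE_KB = 4  # fio was run with --bs=4k

def iops_from_bandwidth(bw_kb_s, block_kb=BLOCK_SIZE_KB):
    """IOPS implied by a given bandwidth at a fixed block size."""
    return bw_kb_s / block_kb

# EVO 850 at 32 jobs: fio reported bw=25723KB/s, iops=6430
print(f"EVO 850 @ 32 jobs: {iops_from_bandwidth(25723):.0f} IOPS")  # ~6430

# Single-job result: ~2.5MB/s works out to only ~640 sync-write IOPS
print(f"EVO 850 @ 1 job:  {iops_from_bandwidth(2560):.0f} IOPS")
```

The single-job case is the one that matters for journals, since the journal is written with O_DIRECT and D_SYNC at queue depth 1.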

Regards,

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085
Email  cameron.scrace@xxxxxxxxxxxx
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz



From:        Christian Balzer <chibi@xxxxxxx>
To:        "ceph-users@xxxxxxxx" <ceph-users@xxxxxxxx>
Cc:        Cameron.Scrace@xxxxxxxxxxxx
Date:        08/06/2015 02:40 p.m.
Subject:        Re: [ceph-users] Multiple journals and an OSD on one SSD doable?




On Mon, 8 Jun 2015 14:30:17 +1200 Cameron.Scrace@xxxxxxxxxxxx wrote:

> Thanks for all the feedback.
>
> What makes the EVOs unusable? They should have plenty of speed, but your
> link has them at 1.9MB/s. Is it just the way they handle O_DIRECT and
> D_SYNC?
>
Precisely.
Read that ML thread for details.

And once more, they are also not very durable.
So depending on your usage pattern and the write amplification from Ceph
(Ceph itself and the underlying FS), their TBW/$ will be horrible, costing
you more in the end than more expensive, but an order of magnitude more
durable, DC SSDs.

> Not sure if we will be able to spend any more; we may just have to take
> the performance hit until we can get more money for the project.
>
You could cheap out with 200GB DC S3700s (half the price), but they will
definitely become the bottleneck at a combined max speed of about 700MB/s,
as opposed to the 400GB ones at 900MB/s combined.
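The bottleneck arithmetic above can be sketched out explicitly. This is a hypothetical illustration, not from the original mails: the per-drive sequential-write figures (~350MB/s for the 200GB S3700, ~450MB/s for the 400GB) are assumptions back-derived from the combined numbers quoted above, and 10GbE is taken as ~1250MB/s line rate.

```python
# Rough per-node bottleneck estimate: combined journal SSD write bandwidth
# vs. the network. Per-drive figures are assumptions consistent with the
# combined numbers quoted above (~700MB/s for 2x200GB, ~900MB/s for 2x400GB).

TEN_GBE_MB_S = 1250  # 10Gb/s line rate, ignoring protocol overhead

def journal_bottleneck(ssd_count, per_ssd_write_mb_s, network_mb_s=TEN_GBE_MB_S):
    """Return the limiting throughput and which component imposes it."""
    combined = ssd_count * per_ssd_write_mb_s
    if combined < network_mb_s:
        return combined, "journal SSDs"
    return network_mb_s, "network"

for size, per_ssd in ((200, 350), (400, 450)):
    limit, what = journal_bottleneck(2, per_ssd)
    print(f"2x {size}GB DC S3700: limited to ~{limit}MB/s by the {what}")
```

Either pair stays below the 10GbE line rate, which is why the journals, not the network, become the ceiling in both configurations.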

Christian

> Thanks,
>
> Cameron Scrace
>
>
>
> From:   Christian Balzer <chibi@xxxxxxx>
> To:     "ceph-users@xxxxxxxx" <ceph-users@xxxxxxxx>
> Cc:     Cameron.Scrace@xxxxxxxxxxxx
> Date:   08/06/2015 02:00 p.m.
> Subject:        Re: [ceph-users] Multiple journals and an OSD on one SSD
> doable?
>
>
>
>
> Cameron,
>
> To offer at least some constructive advice here instead of just all doom
> and gloom, here's what I'd do:
>
> Replace the OS SSDs with 2 400GB Intel DC S3700s (or S3710s).
> They have enough BW to nearly saturate your network.
>
> Put all your journals on them (3 SSD OSDs and 3 HDD OSDs per drive).
> While that's a bad move from a failure domain perspective, your budget
> probably won't allow for anything better, and those are VERY reliable
> and, just as importantly, durable SSDs.
>
> This will give you the speed your current setup is capable of, probably
> limited by the CPU when it comes to SSD pool operations.
>
> Christian
>
> On Mon, 8 Jun 2015 10:44:06 +0900 Christian Balzer wrote:
>
> >
> > Hello Cameron,
> >
> > On Mon, 8 Jun 2015 13:13:33 +1200 Cameron.Scrace@xxxxxxxxxxxx wrote:
> >
> > > Hi Christian,
> > >
> > > Yes we have purchased all our hardware, was very hard to convince
> > > management/finance to approve it, so some of the stuff we have is a
> > > bit cheap.
> > >
> > Unfortunate. Both the done deal and the cheapness.
> >
> > > We have four storage nodes each with 6 x 6TB Western Digital Red
> > > SATA Drives (WD60EFRX-68M) and 6 x 1TB Samsung EVO 850s SSDs and
> > > 2x250GB Samsung EVO 850s (for OS raid).
> > > CPUs are Intel Atom C2750  @ 2.40GHz (8 Cores) with 32 GB of RAM.
> > > We have a 10Gig Network.
> > >
> > I wish there was a nice way to say this, but it unfortunately boils
> > down to a "You're fooked".
> >
> > There have been many discussions about which SSDs are usable with Ceph,
> > very recently as well.
> > Samsung EVOs (the non DC type for sure) are basically unusable for
> > journals. See the recent thread:
> >  Possible improvements for a slow write speed (excluding independent
> > SSD journals) and:
> > http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
> > for reference.
> >
> > I presume your intention for the 1TB SSDs is an SSD-backed pool?
> > Note that the EVOs have a pretty low (guaranteed) endurance, so aside
> > from needing journal SSDs that actually can do the job, you're looking at
> > wearing them out rather quickly (depending on your use case, of course).
> >
> > Now with SSD based OSDs, or even HDD based OSDs with SSD journals, that CPU
> > looks a bit anemic.
> >
> > More below:
> > > The two options we are considering are:
> > >
> > > 1) Use two of the 1TB SSDs for the spinning disk journals (3 each) and
> > > then use the remaining 900+GB of each drive as an OSD to be part of
> > > the cache pool.
> > >
> > > 2) Put the spinning disk journals on the OS SSDs and use the 2 1TB
> > > SSDs for the cache pool.
> > >
> > Cache pools aren't all that speedy currently (research the ML
> > archives), even less so with the SSDs you have.
> >
> > Christian
> >
> > > In both cases the other 4 1TB SSDs will be part of their own tier.
> > >
> > > Thanks a lot!
> > >
> > > Cameron Scrace
> > >
> > >
> > >
> > > From:   Christian Balzer <chibi@xxxxxxx>
> > > To:     "ceph-users@xxxxxxxx" <ceph-users@xxxxxxxx>
> > > Cc:     Cameron.Scrace@xxxxxxxxxxxx
> > > Date:   08/06/2015 12:18 p.m.
> > > Subject:        Re: [ceph-users] Multiple journals and an OSD on one
> > > SSD doable?
> > >
> > >
> > >
> > >
> > > Hello,
> > >
> > >
> > > On Mon, 8 Jun 2015 09:55:56 +1200 Cameron.Scrace@xxxxxxxxxxxx wrote:
> > >
> > > > The other option we were considering was putting the journals on
> > > > the OS SSDs, they are only 250GB and the rest would be for the OS.
> > > > Is that a decent option?
> > > >
> > > You'll get a LOT better advice if you tell us more details.
> > >
> > > For starters, have you bought the hardware yet?
> > > Tell us about your design, how many initial storage nodes, how many
> > > HDDs/SSDs per node, what CPUs/RAM/network?
> > >
> > > What SSDs are we talking about? Exact models, please.
> > > (Neither of the sizes you mentioned rings a bell for any DC-level
> > > SSDs I'm aware of.)
> > >
> > > That said, I'm using Intel DC S3700s for mixed OS and journal use with
> > > good results.
> > > In your average Ceph storage node, normal OS activity (mostly logging)
> > > is a minute drop in the bucket for any decent SSD, so nearly all of its
> > > resources are available to journals.
> > >
> > > You want to match the number of journals per SSD to the capabilities
> > > of your SSDs, HDDs and network.
> > >
> > > For example, 8 HDD OSDs with two 200GB DC S3700s and a 10Gb/s network
> > > is a decent match.
> > > The two SSDs at 900MB/s would appear to be the bottleneck, but in
> > > reality I'd expect the HDDs to be it.
> > > Never mind that you'd be more likely to be IOPS-bound than bandwidth-bound.
> > >
> > > Regards,
> > >
> > > Christian
> > >
> > > > Thanks!
> > > >
> > > > Cameron Scrace
> > > >
> > > >
> > > >
> > > > From:   Somnath Roy <Somnath.Roy@xxxxxxxxxxx>
> > > > To:     "Cameron.Scrace@xxxxxxxxxxxx"
> > > > <Cameron.Scrace@xxxxxxxxxxxx>, "ceph-users@xxxxxxxx" <ceph-users@xxxxxxxx>
> > > > Date:   08/06/2015 09:34 a.m.
> > > > Subject:        RE: [ceph-users] Multiple journals and an OSD on one SSD doable?
> > > >
> > > >
> > > >
> > > > Cameron,
> > > > Generally, it's not a good idea.
> > > > You want to protect the SSDs you use as journals. If anything goes
> > > > wrong with that disk, you will lose all of the OSDs that depend on it.
> > > > I don't think a bigger journal will gain you much performance, so the
> > > > default 5 GB journal size should be good enough. If you want to
> > > > reduce the fault domain and put 3 journals on an SSD, go for
> > > > minimum-size, high-endurance SSDs.
> > > > Now, if you want to use the rest of the space on the 1 TB SSDs,
> > > > creating plain OSDs there will not gain you much (you may rather get
> > > > some burst performance). You may want to consider the following.
> > > >
> > > > 1. If your spindle OSDs are much bigger than 900 GB and you don't
> > > > want to make all OSDs a similar size, a cache pool could be one of
> > > > your options. But remember, a cache pool can wear out your SSDs
> > > > faster, as I believe it presently does not optimize away the extra
> > > > writes. Sorry, I don't have exact data as I am yet to test that out.
> > > >
> > > > 2. If you want to make all the OSDs a similar size and you can
> > > > create a substantial number of OSDs with your unused SSD space
> > > > (depending on how big the cluster is), you may want to put all of
> > > > your primary OSDs on SSD and gain a significant read performance
> > > > boost. Also, in this case, I don't think you will get any burst
> > > > performance.
> > > > Thanks & Regards
> > > > Somnath
> > > >
> > > > From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> > > > Cameron.Scrace@xxxxxxxxxxxx
> > > > Sent: Sunday, June 07, 2015 1:49 PM
> > > > To: ceph-users@xxxxxxxx
> > > > Subject: [ceph-users] Multiple journals and an OSD on one SSD doable?
> > > >
> > > > We are setting up a Ceph cluster and want the journals for our
> > > > spinning disks to be on SSDs, but all of our SSDs are 1TB. We were
> > > > planning on putting 3 journals on each SSD, but that leaves 900+GB
> > > > unused on the drive. Is it possible to use the leftover space as
> > > > another OSD, or will it affect performance too much?
> > > >
> > > > Thanks,
> > > >
> > > > Cameron Scrace
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>


--
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx                    Global OnLine Japan/Fusion Communications
http://www.gol.com/

Attention: This email may contain information intended for the sole use of the original recipient. Please respect this when sharing or disclosing this email's contents with any third party. If you believe you have received this email in error, please delete it and notify the sender or postmaster@xxxxxxxxxxxxxxxxxxxxx as soon as possible. The content of this email does not necessarily reflect the views of Solnet Solutions Ltd.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
