Re: Real world benefit from SSD Journals for a more read than write cluster

Hello,

On Thu, 10 Mar 2016 22:25:10 -0500 Alex Gorbachev wrote:

> Reviving an old thread:
> 
> On Sunday, July 12, 2015, Lionel Bouton <lionel+ceph@xxxxxxxxxxx> wrote:
> 
> > On 07/12/15 05:55, Alex Gorbachev wrote:
> > > FWIW. Based on the excellent research by Mark Nelson
> > > (
> > http://ceph.com/community/ceph-performance-part-2-write-throughput-without-ssd-journals/
> > )
> > > we have dropped SSD journals altogether, and instead went for the
> > > battery-protected controller writeback cache.
> >
> > Note that this has limitations (and the research is nearly 2 years
> > old):
> > - the controller writeback caches are relatively small (often less
> > than 4GB; 2GB is common, a small portion of it is not usable, and
> > 10% of the rest is often used for readahead/read cache) and shared
> > by all of your drives. If your workload is not made of write spikes
> > but of nearly constant writes, this won't help: each OSD will be
> > limited to roughly half of the disk IOPS. With journals on SSDs,
> > once you hit their limit (~5GB of buffer for 10GB journals, not
> > <2GB divided by the number of OSDs per controller), the limit is
> > the raw disk IOPS. A back-of-envelope sketch of this arithmetic
> > follows the list.
> > - you *must* make sure the controller is configured to switch to
> > write-through when the battery/capacitor fails (otherwise a power
> > failure on hardware from the same generation could make you lose
> > all of the OSDs connected to it in a single event, which means data
> > loss),
> > - you should monitor the battery/capacitor status to trigger
> > maintenance (your cluster will slow down while the battery/capacitor
> > awaits replacement; depending on your cluster configuration you
> > might want to down the associated OSDs, see the check sketched after
> > this list). We mostly eliminated this problem by replacing the whole
> > chassis of the servers we lease with each new generation every 2 or
> > 3 years: if you time the hardware replacement to match a fresh
> > chassis generation you get fresh capacitors, and they shouldn't fail
> > you (ours are rated for 3 years).
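
A back-of-envelope sketch of the IOPS arithmetic above, in Python; the
150 raw IOPS figure is an assumption for a typical 7200 rpm SATA drive,
not a measurement:

raw_disk_iops = 150          # assumption: typical 7200 rpm SATA drive

# Journal on the same spindle: every client write hits the disk twice
# (journal write, then data write), so sustained throughput is roughly
# half the raw IOPS.
journal_on_disk = raw_disk_iops / 2

# Journal on a separate SSD: the spindle only sees the data write, so
# the sustained limit is the raw disk IOPS.
journal_on_ssd = raw_disk_iops

print("journal on disk: ~%d IOPS sustained" % journal_on_disk)
print("journal on SSD:  ~%d IOPS sustained" % journal_on_ssd)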
> >
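For the write-through switch and the monitoring, a minimal check along
these lines can be wired into Nagios or similar; it assumes an LSI
controller with MegaCli installed (there the matching cache policy knob
is the CachedBadBBU/NoCachedBadBBU logical drive property), and command
names and output formats vary with vendor and tool version:

#!/usr/bin/env python
# BBU check, a sketch: alert when the controller battery is no longer
# "Optimal" so the cache can be serviced (or forced to write-through)
# before a power failure can take out the OSDs behind it.
# Assumes LSI + MegaCli; HP Smart Array tools report this differently.
import subprocess
import sys

out = subprocess.check_output(
    ["MegaCli64", "-AdpBbuCmd", "-GetBbuStatus", "-aAll"]).decode()

for line in out.splitlines():
    if line.strip().startswith("Battery State"):
        state = line.split(":", 1)[1].strip()
        if state == "Optimal":
            print("OK: BBU state is %s" % state)
            sys.exit(0)
        print("CRITICAL: BBU state is %s" % state)
        sys.exit(2)

print("UNKNOWN: could not parse MegaCli output")
sys.exit(3)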
> > We just ordered Intel S3710 SSDs even though we have
> > battery/capacitor backed caches on the controllers: latencies have
> > nevertheless started to rise during long periods of write-intensive
> > activity. I'm currently pondering whether we should bypass the
> > write cache for the SSDs. The cache is obviously less effective on
> > them and might be more useful overall if it is dedicated to the
> > rotating disks. Does anyone have test results with the cache
> > active/inactive on SSD journals with HP Smart Array P420 or P840
> > controllers?
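
No numbers for those controllers here, but a probe along these lines
(or fio with --sync=1) run against a scratch partition on the journal
SSD, once with the logical drive's controller cache enabled and once
with it disabled, would show the difference. A sketch; it is
destructive to the target, so never point it at a live journal:

#!/usr/bin/env python
# O_DSYNC 4k write latency probe, a sketch. DESTROYS DATA on the
# target: use a scratch partition only.
import os
import sys
import time

dev = sys.argv[1]                # a scratch partition!
block = b"\0" * 4096
rounds = 1000

fd = os.open(dev, os.O_WRONLY | os.O_DSYNC)
lat = []
for _ in range(rounds):
    t0 = time.perf_counter()
    os.pwrite(fd, block, 0)      # rewrite the same 4k, synchronously
    lat.append(time.perf_counter() - t0)
os.close(fd)

lat.sort()
print("avg %.2f ms, p99 %.2f ms"
      % (sum(lat) / len(lat) * 1e3, lat[int(len(lat) * 0.99)] * 1e3))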
> 
> 
> We came to the same conclusion once we started seeing more constant
> write loads. Thank you for the great info. One question: have you
> tried SSD journals with and without additional controller cache? Any
> benefit?
>
Haven't tried that with journal SSDs, simply because I tend to use DC
S3700s there, which would benefit little considering the cost of a fast
enough controller with ample cache.

That said, I've done this both with HDDs carrying on-disk journals
(with the expected results, as detailed above) and with consumer Intel
530 SSDs in some Twin servers that came with LSI 2108 controllers.

In the latter case these are OS disks, nothing Ceph related.
But the HW controller cache nicely masks the garbage-collection spikes
and the slowness of SYNC writes on these SSDs in medium-load scenarios.
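
Those GC pauses are easy to surface if anyone wants to check their own
drives: stream sync writes for a while and record the worst case. A
rough sketch; the scratch path below is hypothetical, put it on the
filesystem under test:

#!/usr/bin/env python
# Worst-case sync write latency over a sustained stream, a sketch to
# surface SSD garbage-collection pauses that a controller cache would
# otherwise hide. The path below is a hypothetical scratch file.
import os
import time

path = "/tmp/gc-probe.bin"
block = b"\0" * 65536            # 64k sync writes
limit = 1 << 30                  # wrap the file at 1 GiB
duration = 60                    # seconds

fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DSYNC)
worst, pos = 0.0, 0
end = time.monotonic() + duration
while time.monotonic() < end:
    if pos >= limit:
        os.lseek(fd, 0, os.SEEK_SET)
        pos = 0
    t0 = time.perf_counter()
    os.write(fd, block)
    worst = max(worst, time.perf_counter() - t0)
    pos += len(block)
os.close(fd)
os.unlink(path)
print("worst sync write: %.1f ms" % (worst * 1e3))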

In short, HW cache should always help, but it can only do so much (for
so long), so unless you already have hardware with it or can get it
dirt cheap, it's not particularly economical once you reach its limits.

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


