Re: Tip of the week: don't use Intel 530 SSD's for journals

I have suffered power losses in every data center I've been in.  I have lost SSDs because of it (Intel 320 Series).  The worst time, I lost both SSDs in a RAID1.  That was a bad day.

I'm using the Intel DC S3700 now, so I don't have a repeat.  My cluster is small enough that losing a journal SSD would be a major headache.

I'm manually monitoring wear level.  So far all of my journal SSDs are still at 100% lifetime.  I do have some of the Intel 320s that are down to 45% lifetime remaining (those are in less critical roles).  One of these days I'll get around to automating it.
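
When I do get around to it, it will probably look something like the rough sketch below.  It assumes smartmontools is installed and that the drives report the Intel-style Media_Wearout_Indicator attribute; the device list and the warning threshold are just placeholders, so adjust them for your own hardware.

#!/usr/bin/env python
# Rough wear-level check for journal SSDs.  Needs root to run smartctl.
import subprocess

DEVICES = ["/dev/sda", "/dev/sdb"]   # placeholder journal devices
THRESHOLD = 30                       # warn when the normalized value drops this low

for dev in DEVICES:
    output = subprocess.check_output(["smartctl", "-A", dev]).decode()
    for line in output.splitlines():
        # smartctl -A columns: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH ...
        if "Media_Wearout_Indicator" in line:
            value = int(line.split()[3])          # normalized VALUE column
            print("%s wear level: %d%%" % (dev, value))
            if value <= THRESHOLD:
                print("WARNING: %s is approaching end of life" % dev)

Hook something like that into cron or whatever monitoring you already run, and a wearing-out SSD becomes a ticket instead of a surprise.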


Speed-wise, my small cluster was fast enough without SSDs, until I started to expand.  I'm only using RadosGW, and I only care about latency on a human timescale.  A second or two of latency is annoying, but not a big deal.

I went from 3 nodes to 5, and the expansion was extremely painful.  I admit that I inflicted a lot of it on myself: I expanded too fast (add all the OSDs at the same time?  Sure, why not), and I was using the default configs.  Things got better after I lowered the backfill priority and count, and learned to add one or two disks at a time.  Still, customers noticed the increase in latency while I was adding OSDs.
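
For anyone hitting the same thing, the knobs I mean are the backfill/recovery settings in the [osd] section of ceph.conf.  The values below are just an example of throttling them down, not a recommendation (the defaults are much higher):

[osd]
    osd max backfills = 1
    osd recovery max active = 1
    osd recovery op priority = 1

You can also inject them into a running cluster without restarting anything, e.g. ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1', and raise them again once the backfill settles down.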

Now that I have the journals on SSDs, customers don't notice the maintenance anymore.  RadosGW latency goes from ~50ms to ~80ms, not from ~50ms to ~2000ms.



On Tue, Nov 25, 2014 at 9:12 AM, Michael Kuriger <mk7193@xxxxxx> wrote:
My cluster is actually very fast without SSD drives.  Thanks for the
advice!

Michael Kuriger
mk7193@xxxxxx
818-649-7235

MikeKuriger (IM)




On 11/25/14, 7:49 AM, "Mark Nelson" <mark.nelson@xxxxxxxxxxx> wrote:

>On 11/25/2014 09:41 AM, Erik Logtenberg wrote:
>> If you are like me, you have the journals for your OSD's with rotating
>> media stored separately on an SSD. If you are even more like me, you
>> happen to use Intel 530 SSD's in some of your hosts. If so, please do
>> check your S.M.A.R.T. statistics regularly, because these SSD's really
>> can't cope with Ceph.
>>
>> Check out the media-wear graphs for the two Intel 530's in my cluster.
>> As soon as those declining lines get down to 30% or so, they need to be
>> replaced. That means less than half a year between purchase and
>> end-of-life :(
>>
>> Tip of the week, keep an eye on those statistics, don't let a failing
>> SSD surprise you.
>
>This is really good advice, and it's not just the Intel 530s.  Most
>consumer grade SSDs have pretty low write endurance.  If you mostly are
>doing reads from your cluster you may be OK, but if you have even
>moderately high write workloads and you care about avoiding OSD downtime
>(which in a production cluster is pretty important though not usually
>100% critical), get high write endurance SSDs.
>
>Mark
>
>>
>> Erik.
>>

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
