On Tue, 10 May 2016 17:51:24 +0200 Yoann Moulin wrote:

[snip]
> >>>> Journal or cache Storage : 2 x SSD 400GB Intel S3300 DC (no Raid)
> >>>
> >>> These SSDs do not exist according to the Intel site and the only
> >>> references I can find for them are on "no longer available" European
> >>> sites.
> >>
> >> I made a mistake, it's not 400 but 480GB, smartctl gives me Model
> >> SSDSC2BB480H4
> >>
> > OK, that's not good.
> > Firstly, that model number still doesn't get us any hits from Intel,
> > strangely enough.
> >
> > Secondly, it is 480GB (instead of 400, which denotes overprovisioning)
> > and matches the 3510 480GB model up to the last 2 characters.
> > And that has an endurance of 275TBW, not something you want to use for
> > either journals or cache pools.
>
> I see, here is the information from the reseller :
>
> "The S3300 series is the OEM version of S3510 and 1:1 the same drive"
>
Given the SMART output below, it seems to be 3500 based, but that doesn't
change things.

> >>> Without knowing the specifications for these SSDs, I can't recommend
> >>> them. I'd use DC S3610 or 3710 instead, this very much depends on how
> >>> much endurance (TBW) you need.
> >>
> >> As I write above, I already have those SSDs so I look for the best
> >> setup with the hardware I have.
> >>
> >
> > Unless they have at least an endurance of 3 DWPD like the 361x (and
> > their model number, size and the 3300 naming suggest they do NOT),
> > your 480GB SSDs aren't suited for intense Ceph usage.
> >
> > How much have you used them yet and what is their smartctl status, in
> > particular these values (from an 800GB DC S3610 in my cache pool):
> > ---
> > 232 Available_Reservd_Space 0x0033 100 100 010 Pre-fail Always - 0
> > 233 Media_Wearout_Indicator 0x0032 100 100 000 Old_age  Always - 0
> > 241 Host_Writes_32MiB       0x0032 100 100 000 Old_age  Always - 869293
> > 242 Host_Reads_32MiB        0x0032 100 100 000 Old_age  Always - 43435
> > 243 NAND_Writes_32MiB       0x0032 100 100 000 Old_age  Always - 1300884
> > ---
> >
> > Not even 1% down after 40TBW, at which point your SSDs are likely to be
> > 15% down...
>
> More or less the same value on the 10 hosts I have on my beta cluster :
>
> 232 Available_Reservd_Space 0x0033 100 100 010 Pre-fail Always - 0
> 233 Media_Wearout_Indicator 0x0032 100 100 000 Old_age  Always - 0
> 241 Total_LBAs_Written      0x0032 100 100 000 Old_age  Always - 233252
> 242 Total_LBAs_Read         0x0032 100 100 000 Old_age  Always - 13
>
From the read count it's obvious that you used those as journals. ^.^

As I hinted above, if these were 3510 based they also should have the 243
attribute, as in my 3610 example.
You may want to upgrade your smartctl and/or its definition DB (on Debian
that can be done with "update-smart-drivedb").

Intel's calculation of the media wearout always seems very fuzzy to me.
Given your 7TB written I'd expect it to be at 98%, or at least down to 99%.
But then again a 200GB DC S3700 of mine has written 90TB out of 3650TB
total and is at 99%, when I would expect it to be at 98%.

Either way, those SSDs are designed for 275TBW (or 0.3 DWPD), and if they
are used as journals they will expire quickly when those 100TB+ datasets
get updated.

They _might_ survive longer with a very carefully tuned cache tier (promote
only really hot objects), but the risk of losing SSDs there can be even
higher than with journals.
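
To make the arithmetic above easier to redo on your own drives, here is a
minimal Python sketch. The function names are made up for illustration. It
assumes that attribute 241 on your 480GB SSDs counts 32MiB units (as it is
labelled on my 3610), that wear scales linearly with the rated TBW, and that
the rating applies over a 5 year warranty period. Treat the results as rough
estimates only, the real Media_Wearout_Indicator clearly does not follow a
straight line.

```python
# Rough endurance arithmetic for the figures quoted in this thread.
# All helper names are hypothetical; the linear wear model is an assumption.

MIB = 1024 ** 2   # bytes per MiB
TB = 10 ** 12     # bytes per decimal TB, the unit vendors use for TBW


def units_32mib_to_tb(raw_value):
    """Convert a SMART raw value counted in 32 MiB units to decimal TB."""
    return raw_value * 32 * MIB / TB


def expected_wearout(written_tb, rated_tbw):
    """Naive linear estimate of Media_Wearout_Indicator (100 means new)."""
    return 100.0 * (1.0 - written_tb / rated_tbw)


def dwpd(rated_tbw, capacity_gb, warranty_years=5):
    """Drive writes per day implied by a TBW rating (warranty length assumed)."""
    return rated_tbw * 1000.0 / (capacity_gb * 365 * warranty_years)


if __name__ == "__main__":
    # Attribute 241 raw value reported for the 480GB drives.
    host_tb = units_32mib_to_tb(233252)
    print(f"host writes:      {host_tb:.1f} TB")
    print(f"expected wearout: {expected_wearout(host_tb, 275):.1f}")
    print(f"DWPD at 275 TBW:  {dwpd(275, 480):.2f}")
```

Depending on whether you count in TB or TiB this gives between 7 and 8TB
written, and an expected indicator of about 97 to 98 rather than the 100
your drives report.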
[snap]

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com