Yes, indeed. It does not seem to matter much if you do not have a
write-intensive cluster. We have Intel 520s that were in production for over
2 years and used only 5% of their life according to SMART. I've also used
Samsung 840 Pro drives, which showed the same or similar figures over a year
of usage. So, I guess for my purposes endurance is not such a big deal.

However, the SSDs that I have absolutely suck performance-wise for the Ceph
journal, especially the Samsung drives. That's the main reason for wanting
the 3700/3500 or their equivalent.

Andrei

----- Original Message -----
> From: "Tyler Bishop" <tyler.bishop@xxxxxxxxxxxxxxxxx>
> To: "Lionel Bouton" <lionel+ceph@xxxxxxxxxxx>
> Cc: "Andrei Mikhailovsky" <andrei@xxxxxxxxxx>, "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
> Sent: Tuesday, 22 December, 2015 16:36:21
> Subject: Re: Intel S3710 400GB and Samsung PM863 480GB fio results
>
> Write endurance is kinda bullshit.
>
> We have Crucial 960GB drives storing data and we've only managed to take 2% off
> the drives' life in the period of a year, with hundreds of TB written weekly.
>
> Stuff is way more durable than anyone gives it credit for.
>
> ----- Original Message -----
> From: "Lionel Bouton" <lionel+ceph@xxxxxxxxxxx>
> To: "Andrei Mikhailovsky" <andrei@xxxxxxxxxx>, "ceph-users"
> <ceph-users@xxxxxxxxxxxxxx>
> Sent: Tuesday, December 22, 2015 11:04:26 AM
> Subject: Re: Intel S3710 400GB and Samsung PM863 480GB fio results
>
> On 22/12/2015 13:43, Andrei Mikhailovsky wrote:
>> Hello guys,
>>
>> I was wondering if anyone has tested the Samsung PM863 120GB version to see
>> how it performs? IMHO the 480GB version seems like a waste for the journal, as
>> you only need a small disk to fit 3-4 OSD journals, unless you get
>> far greater durability.
>
> The problem is endurance.
> If we use the 480GB model for 3 OSDs each on the
> cluster we might build, we expect 3 years (with some margin for error but
> not including any write amplification at the SSD level) before the SSDs
> fail.
> In our context a 120GB model might not even last a year (its endurance is
> one quarter that of the 480GB model). This is why SM863 models will probably
> be more suitable if you have access to them: you can use smaller ones, which
> cost less, and get more endurance (you'll have to check the performance
> though, as smaller models usually have lower IOPS and bandwidth).
>
>> I am planning to replace my current journal SSDs over the next month or so and
>> would like to find out if there is a good alternative to Intel's
>> 3700/3500 series.
>
> 3700s are a safe bet (the 100GB model is rated for ~1.8PBW). 3500 models
> probably don't have enough endurance to be cost effective for many Ceph
> clusters. The 120GB model is only rated for 70TBW and you have to
> consider both client writes and rebalance events.
> I'm uneasy with SSDs expected to fail within the life of the system they
> are in: you can have a cascade effect where an SSD failure brings down
> several OSDs, triggering a rebalance which might make SSDs installed at
> the same time fail too. In this case, in the best scenario you will reach
> your min_size (>=2) and block any writes, which would prevent more SSD
> failures until you move journals to fresh SSDs. If min_size = 1 you
> might actually lose data.
>
> If you expect to replace your current journal SSDs, if I were you I would
> stagger the deployment over several months to a year to avoid them
> failing at the same time in case of an unforeseen problem. In addition,
> this would allow you to evaluate the performance and behavior of a new SSD
> model with your hardware (there have been reports of performance
> problems with some combinations of RAID controllers and SSD
> models/firmware versions) without impacting your cluster's overall
> performance too much.
>
> When using SSDs for journals you have to monitor both:
> * the SSD wear leveling or something equivalent (SMART data may not be
>   available if you use a RAID controller, but usually you can get the total
>   amount of data written) of each SSD,
> * the client writes on the whole cluster.
> And check periodically the expected remaining lifespan of each
> of your SSDs based on their current state, average write speed, estimated
> write amplification (both due to the pool's size parameter and the SSD
> model's inherent write amplification) and the amount of data moved by
> the rebalance events you expect to happen.
> Ideally you should make this computation before choosing the SSD models,
> but several variables are not always easy to predict and will probably
> change during the life of your cluster.
>
> Lionel
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
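
For what it's worth, the lifespan check Lionel describes can be roughed out in a few lines of Python. This is only a sketch under simplifying assumptions that are not from the thread: every client write is journaled once per replica (so it is multiplied by the pool's size), journal writes are spread evenly over all journal SSDs, and rebalance traffic is folded in as a flat daily average. The function and field names are made up for illustration, and the workload numbers (0.5 TB/day of client writes, size 3, 12 journal SSDs) are hypothetical. Only the 70TBW and ~1.8PBW ratings come from the thread.

```python
from dataclasses import dataclass


@dataclass
class JournalSsd:
    """One journal SSD (hypothetical helper type, not a Ceph object)."""
    rated_tbw: float   # vendor endurance rating, in TB written
    written_tb: float  # total data written so far, in TB (e.g. from SMART)


def per_ssd_writes_tb_per_day(client_tb_per_day, pool_size, journal_ssds,
                              ssd_write_amplification=1.0,
                              rebalance_tb_per_day=0.0):
    """Estimate the daily write load landing on a single journal SSD.

    Assumes each client write is journaled once per replica and that the
    load is spread evenly across all journal SSDs in the cluster.
    """
    if journal_ssds <= 0:
        raise ValueError("journal_ssds must be positive")
    cluster_tb_per_day = client_tb_per_day * pool_size + rebalance_tb_per_day
    return cluster_tb_per_day * ssd_write_amplification / journal_ssds


def remaining_days(ssd, writes_tb_per_day):
    """Days left before the SSD reaches its rated endurance."""
    remaining_tb = max(ssd.rated_tbw - ssd.written_tb, 0.0)
    if writes_tb_per_day <= 0:
        return float("inf")
    return remaining_tb / writes_tb_per_day


if __name__ == "__main__":
    # Hypothetical workload: 0.5 TB/day of client writes, size=3, 12 SSDs.
    load = per_ssd_writes_tb_per_day(0.5, pool_size=3, journal_ssds=12)
    # Endurance ratings quoted in the thread: 70 TBW and ~1.8 PBW.
    for name, tbw in (("3500 120GB", 70.0), ("3700 100GB", 1800.0)):
        days = remaining_days(JournalSsd(rated_tbw=tbw, written_tb=0.0), load)
        print(f"{name}: about {days:.0f} days at {load} TB/day per SSD")
    # 3500 120GB: about 560 days at 0.125 TB/day per SSD
    # 3700 100GB: about 14400 days at 0.125 TB/day per SSD
```

Re-running this periodically with the current written_tb and a fresh write-rate average gives the trend, and feeding in a pessimistic rebalance_tb_per_day shows how much margin is left.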