On 07/12/15 05:55, Alex Gorbachev wrote:
> FWIW. Based on the excellent research by Mark Nelson
> (http://ceph.com/community/ceph-performance-part-2-write-throughput-without-ssd-journals/)
> we have dropped SSD journals altogether, and instead went for the
> battery protected controller writeback cache.

Note that this has limitations (and the research is nearly 2 years old):

- the controller writeback caches are relatively small (often less than
  4GB; 2GB is common on the controller, a small portion is not usable,
  and 10% of the rest is often used for readahead/read cache) and this
  is shared by all of your drives. If your workload is not "write
  spikes" oriented but nearly constant writes, this won't help, as you
  will be limited on each OSD by roughly half of the disk IOPS. With
  journals on SSDs, when you hit their limit (which is ~5GB of buffer
  for 10GB journals and not <2GB divided by the number of OSDs per
  controller), the limit is the raw disk IOPS.
- you *must* make sure the controller is configured to switch to
  write-through when the battery/capacitor fails (otherwise a power
  failure on hardware from the same generation could make you lose all
  of the OSDs connected to them in a single event, which means data
  loss),
- you should monitor the battery/capacitor status to trigger maintenance
  (and your cluster will slow down while the battery/capacitor is
  waiting for a replacement; you might want to down the associated OSDs
  depending on your cluster configuration).

We mostly eliminated this problem by replacing the whole chassis of the
servers we lease for new generations every 2 or 3 years: if you time the
hardware replacement to match a fresh chassis generation, this means
fresh capacitors and they shouldn't fail you (ours are rated for 3
years).

We just ordered Intel S3710 SSDs even though we have battery/capacitor
backed caches on the controllers: the latencies have started to rise
nevertheless when there are long periods of write intensive activity.
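
To put rough numbers on the buffer comparison above, here is a minimal
Python sketch. The 2GB cache, the 10% read share and the 10GB journal
with ~5GB of usable buffer are the figures quoted above; the 0.2GB
unusable portion and the 8 OSDs per controller are assumptions picked
purely for illustration, and the function names are made up for this
example.

```python
def controller_write_cache_per_osd_gb(cache_gb, unusable_gb, read_fraction, osds):
    """Write-back cache available to each OSD behind one shared controller."""
    usable_gb = cache_gb - unusable_gb              # part of the cache is not usable
    write_gb = usable_gb * (1.0 - read_fraction)    # the rest minus the read cache share
    return write_gb / osds                          # shared by every drive on the controller


def journal_buffer_per_osd_gb(journal_gb, buffer_fraction=0.5):
    """Write buffer per OSD with a dedicated SSD journal (~5GB for a 10GB journal)."""
    return journal_gb * buffer_fraction


if __name__ == "__main__":
    # Assumed values: 0.2GB unusable and 8 OSDs per controller are illustrative only.
    cache = controller_write_cache_per_osd_gb(
        cache_gb=2.0, unusable_gb=0.2, read_fraction=0.10, osds=8
    )
    journal = journal_buffer_per_osd_gb(journal_gb=10.0)
    print(f"controller cache per OSD: {cache:.2f} GB")
    print(f"SSD journal buffer per OSD: {journal:.2f} GB")
```

With these assumed values the per-OSD share of the controller cache is
more than an order of magnitude smaller than the journal buffer, which
is why sustained writes hit the limit much sooner.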

I'm currently pondering whether we should bypass the write cache for the
SSDs. The cache is obviously less effective on them and might be more
useful overall if it is dedicated to the rotating disks. Does anyone
have test results with cache active/inactive on SSD journals with HP
Smart Array p420 or p840 controllers?

Lionel
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com