On 07/16/2014 09:58 AM, Riccardo Murri wrote:
> Hello,
>
> I am new to Ceph; the group I'm working in is currently evaluating it
> for our new large-scale storage.
>
> Is there any recommendation for the OSD journals? E.g., does it make
> sense to keep them on SSDs? Would it make sense to host the journal
> on a RAID-1 array for added safety? (IOW: what happens if the journal
> device fails and the journal is lost?)
>
> Thanks for any explanation and suggestion!

Hi,

There are a couple of common configurations that make sense, imho:

1) Leave the journals on the same disks as the data (best to give them their own partition). This is a fairly safe option, since each OSD relies on only a single disk (i.e., it minimizes potential failure points). It can be slow, but that depends on the controller you use and possibly the IO scheduler. Often a controller with writeback cache helps avoid seek contention during writes, but you will currently lose about half your disk throughput to journal writes during sequential write IO.

2) Put the journals on SSDs. In this scenario you want to match your per-journal SSD speed to the disk speed. I.e., if you have an SSD that can do 400MB/s and disks that can do ~125MB/s of sequential writes, you probably want to put somewhere around 3-5 journals on the SSD, depending on how much sequential write throughput matters to you. The OSDs are now dependent on both the spinning disk and the SSD not failing, and one SSD failure will take down multiple OSDs. You gain speed, though, and may not need more expensive controllers with WB cache (though those may still be useful to protect against power failure).

Some folks have used RAID-1 LUNs for the journals and it works fine, but I'm not really a fan of it, especially with SSDs. You are causing double the writes to the SSDs, and SSDs tend to fail in clumps based on the number of writes. If the choice is between 6 journals per SSD RAID-1 or 3 journals per SSD JBOD, I'd choose the latter.
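As a rough back-of-the-envelope sketch of the sizing above (the function names and throughput figures are just illustrations of the example numbers in this post, not anything Ceph-specific):

```python
# Rough sizing sketch, not an official Ceph tool. Numbers are the example
# figures from the post: a 400 MB/s SSD and ~125 MB/s spinning disks.

def journals_per_ssd(ssd_write_mbps, disk_write_mbps):
    """Max journals one SSD can host before it becomes the
    sequential-write bottleneck for the disks behind it."""
    return ssd_write_mbps // disk_write_mbps

def effective_seq_write(disk_write_mbps):
    """Option 1: journal and data share one disk, so each client byte
    is written twice and sequential throughput is roughly halved."""
    return disk_write_mbps / 2

print(journals_per_ssd(400, 125))   # -> 3 (pack more only if seq writes matter less)
print(effective_seq_write(125))     # -> 62.5 MB/s usable on a co-located journal
```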
I'd want to keep my overall OSD count high, though, to minimize the fallout from 3 OSDs going down at once. Arguably, if you do the RAID-1, can swap failed SSDs quickly, and anticipate that the remaining SSD is likely going to die soon after the first, maybe the RAID-1 is worth it. The disadvantages seem pretty steep to me, though.

Mark

> Riccardo
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com