On Mon, Jul 22, 2013 at 08:45:07AM +1100, Mikaël Cluseau wrote:
> On 22/07/2013 08:03, Charles 'Boyo wrote:
> >Counting on the kernel's cache, it appears I will be best served
> >purchasing write-optimized SSDs?
> >Can you share any information on the SSD you are using, is it PCIe
> >connected?
>
> We are on a standard SAS bus so any SSD going to 500MB/s and being
> stable on the long run (we use 60G Intel 520), you do not need a lot
> of space for the journal (5G per drive is far enough on commodity
> hardware).
>
> >Another question, since the intention of this storage cluster is
> >relatively cheap storage on commodity hardware, what's the balance
> >between cheap SSDs and reliability since journal failure might
> >result in data loss or will such an event just 'down' the affected
> >OSDs?
>

When you do a write to Ceph, one OSD (I believe this is the master for
a certain part of the data, an object) receives the write and
distributes the copies to the other OSDs (as many as are configured,
like: min size=2 size=3). When the writes are done on all those OSDs, it
confirms the write to the client. So if one OSD fails, the other OSDs
will still have that data. The master will have to make sure another
copy is created somewhere else. So I don't see a reason for data loss
if you lose one journal. There will be a lot of copying of data though,
and that will slow things down.

> A journal failure will fail your OSDs (from what I've understood,
> you'll have to rebuild them). But SSDs are very deterministic, so
> monitor them :
>
> # smartctl -A /dev/sdd
> [..]
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
> [..]
> 232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       -       0
> 233 Media_Wearout_Indicator 0x0032   093   093   000    Old_age   Always       -       0
>
> And don't put too many OSDs on one SSD (I set a rule to not go over
> 4 for 1).
>

When the SSD is large enough and the journals don't take up all the
space, you can also leave part of the SSD unpartitioned. This will make
the SSD fail much later.

> >On a similar note, I am using XFS on the OSDs which also journals,
> >does this affect performance in any way?
>
> You want this journal for consistency ;) I don't know exactly the
> impact, but since we use spinning drives, the most important factor
> is that ceph, with a journal on SSD, does a lot of sequential
> writes, avoiding most seeks.
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
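
As a follow-up to the "monitor them" advice above, here is a minimal sketch of how the smartctl output could be checked from a script. It assumes the usual ten-column layout of `smartctl -A`. The function names, the watched IDs (232 and 233, taken from the output quoted above) and the `margin` value are my own assumptions for illustration, not anything Ceph or smartmontools prescribes.

```python
import subprocess

# Sample rows in the column layout of `smartctl -A` (values from the post above).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0032   093   093   000    Old_age   Always       -       0
"""


def parse_smart_attributes(text):
    """Return {attribute_id: fields} for every attribute row in the given text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header line and any banner text are skipped.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs[int(fields[0])] = {
            "name": fields[1],
            "value": int(fields[3]),   # normalized value, counts down with wear
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": " ".join(fields[9:]),
        }
    return attrs


def worn_attributes(attrs, watch=(232, 233), margin=20):
    """Names of watched attributes whose value is within `margin` of THRESH.

    `margin` is an arbitrary, hypothetical alert level, not a vendor figure.
    """
    return [a["name"] for i, a in sorted(attrs.items())
            if i in watch and a["value"] <= a["thresh"] + margin]


def read_device(device):
    """Run smartctl on a device such as /dev/sdd (needs smartmontools and root)."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_smart_attributes(result.stdout)
```

Run from cron against each journal SSD, `worn_attributes(read_device("/dev/sdd"))` would return an empty list while the drive is healthy and the attribute names once they get close to their thresholds.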