Re: Do you see data loss if an SSD hosting several OSD journals crashes

Hello,

On Fri, 20 May 2016 03:44:52 +0000 EP Komarla wrote:

> Thanks Christian.  Point noted.  Going forward I will write text to make
> it easy to read.
> 
> Thanks for your response.  Losing a journal drive seems expensive as I
> will have to rebuild 5 OSDs in this eventuality.
>
Potentially; there are ways to avoid a full rebuild, but that depends on
some factors and is pretty advanced stuff. A rough sketch follows.
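
This assumes filestore OSDs whose data disks survived, a replacement SSD
already partitioned and referenced by each OSD's journal symlink, and
Upstart-style service management; the OSD ids are made up for
illustration:

    # keep CRUSH from rebalancing while the OSDs are briefly down
    ceph osd set noout
    for i in 0 1 2 3 4; do
        stop ceph-osd id=$i          # stop each affected OSD
        ceph-osd -i $i --mkjournal   # initialize its new journal partition
        start ceph-osd id=$i         # and bring it back up
    done
    ceph osd unset noout

Bear in mind that writes which only existed in the dead journal are
gone, so a deep scrub of the affected OSDs afterwards is a good idea.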

It's expensive, but as Dyweni wrote, it is an expected situation that
your cluster should be able to handle.

The chances of losing a journal SSD unexpectedly are of course going to
be very small if you choose the right type of SSD, an Intel DC S37xx or
at least S36xx for example.
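
Whatever model you pick, it also helps to watch the SSD's wear indicator
so that a worn-out journal SSD never comes as a surprise. Something
along these lines; the device name is an example and the attribute
names vary by vendor:

    # Intel DC SSDs report remaining life via SMART attribute 233
    smartctl -A /dev/sdb | grep -i -e Media_Wearout_Indicator -e Wear_Leveling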
 
Christian

> - epk
> 
> -----Original Message-----
> From: Christian Balzer [mailto:chibi@xxxxxxx] 
> Sent: Thursday, May 19, 2016 7:00 PM
> To: ceph-users@xxxxxxxxxxxxxx
> Cc: EP Komarla <Ep.Komarla@xxxxxxxxxxxxxxx>
> Subject: Re:  Do you see data loss if an SSD hosting
> several OSD journals crashes
> 
> 
> Hello,
> 
> first of all, wall of text. Don't do that. 
> Use returns and paragraphs liberally to make reading easy.
> I'm betting at least half of the people who could have answered your
> question took a look at this blob of text and ignored it.
> 
> Secondly, search engines are your friend.
> The first hit when googling for "ceph ssd journal failure" is this gem:
> http://ceph.com/planet/ceph-recover-osds-after-ssd-journal-failure/
> 
> Losing a journal SSD will at most cost you the data on all associated
> OSDs and thus the recovery/backfill traffic, if you don't feel like
> doing what the link above describes.
> 
> Ceph will not acknowledge a client write before all journals (replica
> size, 3 by default) have received the data, so losing one journal SSD
> will NEVER result in an actual data loss.
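> 
> You can verify what a given pool is set to, for example (the pool name
> here is just an example):
> 
>     # replica count and the minimum replicas needed to accept writes
>     ceph osd pool get rbd size
>     ceph osd pool get rbd min_size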
> 
> Christian
> 
> On Fri, 20 May 2016 01:38:08 +0000 EP Komarla wrote:
> 
> > We are trying to assess whether we are going to see data loss if an
> > SSD that is hosting journals for a few OSDs crashes. In our
> > configuration, each SSD is partitioned into 5 chunks and each chunk
> > is mapped as a journal drive for one OSD.
> > 
> > What I understand from the Ceph documentation: "Consistency: Ceph
> > OSD Daemons require a filesystem interface that guarantees atomic
> > compound operations. Ceph OSD Daemons write a description of the
> > operation to the journal and apply the operation to the filesystem.
> > This enables atomic updates to an object (for example, placement
> > group metadata). Every few seconds, between filestore max sync
> > interval and filestore min sync interval, the Ceph OSD Daemon stops
> > writes and synchronizes the journal with the filesystem, allowing
> > Ceph OSD Daemons to trim operations from the journal and reuse the
> > space. On failure, Ceph OSD Daemons replay the journal starting
> > after the last synchronization operation."
> > 
> > So, my question is what happens if an SSD fails: am I going to lose
> > all the data that has not been written/synchronized to the OSDs? In
> > my case, am I going to lose data for all 5 OSDs, which would be bad?
> > This is of concern to us. What are the options to prevent any data
> > loss at all?
> > 
> > Is it better to have the journals on the same hard drive, i.e., to
> > have one journal per OSD and host it on the same hard drive? Of
> > course, performance will not be as good as having an SSD for the
> > OSD journal. In this case, I am thinking I will not lose data as
> > there are secondary OSDs where data is replicated (we are using
> > triple replication).
> > 
> > Any thoughts? What other solutions have people adopted for data
> > reliability and consistency to address the case I am mentioning?
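> > 
> > For reference, the two intervals mentioned in the documentation
> > quote above are plain ceph.conf options; a snippet with the stock
> > defaults (illustrative, not a tuning recommendation) looks like:
> > 
> >     [osd]
> >     # upper bound, in seconds, between journal/filestore syncs
> >     filestore max sync interval = 5
> >     # lower bound, in seconds, between syncs
> >     filestore min sync interval = 0.01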
> 
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/


