On Thu, Oct 27, 2016 at 1:26 PM, Kostis Fardelas <dante1234@xxxxxxxxx> wrote:
> It is no more than a three-line script. You will also need leveldb's
> code (the Python bindings) in your working directory:
>
> ```
> #!/usr/bin/python2
>
> import leveldb
>
> # Repair the leveldb database in ./omap (the OSD's omap directory)
> leveldb.RepairDB('./omap')
> ```
>
> I totally agree that we need more repair tools to be officially
> available, and also tools that provide better insight into components
> that are a "black box" for the operator right now, i.e. the journal.
>
> On 24 October 2016 at 19:36, Dan Jakubiec <dan.jakubiec@xxxxxxxxx> wrote:
>> Thanks Kostis, great read.
>>
>> We also had a Ceph disaster back in August, and a lot of this experience looked familiar. Sadly, in the end we were not able to recover our cluster, but I am glad to hear that you were successful.
>>
>> LevelDB corruptions were one of our big problems. Your note below about running RepairDB from Python is interesting. At the time we were looking for a Ceph tool to run LevelDB repairs in order to get our OSDs back up and couldn't find one. I felt this is something that should be in the standard toolkit.
>>
>> It would be great to see this added some day, but in the meantime I will remember this option exists. If you still have the Python script, perhaps you could post it as an example? I just logged this feature request at http://tracker.ceph.com/issues/17730, so we don't forget it!
>>
>> Thanks!
>>
>> -- Dan
>>
>>
>>> On Oct 20, 2016, at 01:42, Kostis Fardelas <dante1234@xxxxxxxxx> wrote:
>>>
>>> We pulled leveldb from upstream and fired leveldb.RepairDB against the
>>> OSD omap directory using a simple Python script. Ultimately, that
>>> didn't move things forward. We resorted to checking every object's
>>> timestamp/md5sum/attributes on the crashed OSD against the replicas in
>>> the cluster, and in the end chose to discard the journal, once we had
>>> concluded with as much confidence as possible that we would not lose
>>> data.
>>>
>>> It would have been really useful at that moment to have a tool to
>>> inspect the contents of the crashed OSD's journal and limit the scope
>>> of the verification process.
>>>
>>> On 20 October 2016 at 08:15, Goncalo Borges
>>> <goncalo.borges@xxxxxxxxxxxxx> wrote:
>>>> Hi Kostis...
>>>> That is a tale from the dark side. Glad you recovered it and that you were willing to write it all up and share it. Thank you for that.
>>>> Can I also ask which tool you used to recover the leveldb?
>>>> Cheers
>>>> Goncalo
>>>> ________________________________________
>>>> From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Kostis Fardelas [dante1234@xxxxxxxxx]
>>>> Sent: 20 October 2016 09:09
>>>> To: ceph-users
>>>> Subject: Surviving a ceph cluster outage: the hard way
>>>>
>>>> Hello cephers,
>>>> this is the blog post on the Ceph cluster outage we experienced some
>>>> weeks ago and on how we managed to revive the cluster and our
>>>> clients' data.
>>>>
>>>> I hope it will prove useful to anyone who finds himself/herself
>>>> in a similar position. Thanks to everyone on the ceph-users and
>>>> ceph-devel lists who contributed to our inquiries during
>>>> troubleshooting.
>>>>
>>>> https://blog.noc.grnet.gr/2016/10/18/surviving-a-ceph-cluster-outage-the-hard-way/
>>>>
>>>> Regards,
>>>> Kostis

--
Regards
Kefu Chai
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
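Kostis mentions above that, when RepairDB did not help, they compared every object's timestamp/md5sum/attributes on the crashed OSD against the replicas before deciding to discard the journal. The sketch below illustrates that kind of comparison for a FileStore OSD whose objects are plain files under the OSD's current/ directory; it is not the tool used in the thread, the two paths are hypothetical examples, and a real verification would also have to cover xattrs and omap data.

```
#!/usr/bin/python2
# Hedged sketch: walk a crashed OSD's data directory and compare each
# object file's md5sum and mtime against the same relative path on a
# replica. The two paths below are hypothetical examples.

import hashlib
import os

CRASHED = '/var/lib/ceph/osd/ceph-12/current'   # crashed OSD (example path)
REPLICA = '/mnt/replica-osd/current'            # replica copy (example path)

def md5(path, chunk=1 << 20):
    # Stream the file in 1 MiB chunks to avoid loading large objects in memory.
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(chunk), b''):
            h.update(block)
    return h.hexdigest()

for root, _, files in os.walk(CRASHED):
    for name in files:
        src = os.path.join(root, name)
        rel = os.path.relpath(src, CRASHED)
        dst = os.path.join(REPLICA, rel)
        if not os.path.exists(dst):
            print('MISSING on replica: %s' % rel)
        elif md5(src) != md5(dst):
            print('MD5 MISMATCH:       %s' % rel)
        elif int(os.stat(src).st_mtime) != int(os.stat(dst).st_mtime):
            print('MTIME differs:      %s' % rel)
```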