Re: how possible is that ceph cluster crash

Thanks Nick / Samuel,

It's definitely worthwhile to explain exactly why this is such a bad
idea. Spelling out the consequences, rather than just telling people not
to do it, should stop people from ever trying it.



On Sat, Nov 19, 2016 at 12:30 AM, Samuel Just <sjust@xxxxxxxxxx> wrote:
> Many reasons:
>
> 1) You will eventually get a DC-wide power event anyway, at which point
> probably most of the OSDs will have hopelessly corrupted internal XFS
> structures (yes, I have seen this happen to a poor soul in a DC with
> redundant power).
> 2) Even in the case of a single rack/node power failure, the biggest
> danger isn't that the OSDs don't start.  It's that they *do start*,
> but have forgotten or arbitrarily corrupted a random subset of the
> transactions they told other OSDs and clients they had committed.  The
> exact impact would be random, but any guarantees Ceph normally provides
> would be out the window.  RBD devices could, for instance, have random
> byte ranges zapped back in time (not great if those are the offsets
> assigned to your database or filesystem journal...).
> 3) Deliberately power-cycling a node counts as a power failure if you
> don't stop services and sync first (a minimal sketch of doing this
> cleanly follows below).
>
> In other words, don't mess with the definition of "committing a
> transaction" if you value your data.
> -Sam "just say no" Just
>
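Sam's point 3 is worth making concrete. A minimal sketch of the "stop
services and sync first" step before a planned power cycle might look like
the following. It assumes a systemd-based deployment where all OSDs on the
host are grouped under ceph-osd.target, and that you want the noout flag
set so the cluster does not start rebalancing while the node is
deliberately down; adapt it to your own setup.

    #!/usr/bin/env python3
    """Sketch: cleanly quiesce a Ceph OSD host before a planned power cycle.

    Assumes a systemd-managed deployment (ceph-osd.target) and a working
    `ceph` CLI with admin credentials on this host -- adjust as needed.
    """
    import os
    import subprocess

    def run(cmd):
        """Run a command, echo it, and fail loudly if it errors."""
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    def quiesce_osd_host():
        # Tell the cluster not to mark OSDs "out" (and start backfilling)
        # while this host is deliberately down.
        run(["ceph", "osd", "set", "noout"])

        # Stop all OSD daemons on this host so they flush and close their
        # stores cleanly instead of being cut off mid-transaction.
        run(["systemctl", "stop", "ceph-osd.target"])

        # Flush dirty pages to disk before cutting power.
        os.sync()

    if __name__ == "__main__":
        quiesce_osd_host()
        print("OSDs stopped and buffers flushed; safe to power cycle."
              " Remember to `ceph osd unset noout` afterwards.")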
> On Fri, Nov 18, 2016 at 4:04 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
>> Yes, because these things happen
>>
>> http://www.theregister.co.uk/2016/11/15/memset_power_cut_service_interruption/
>>
>> We had customers who had kit in this DC.
>>
>> To use your analogy, it's like crossing the road at traffic lights but not
>> checking that the cars have stopped. You might be OK 99% of the time, but
>> sooner or later it will bite you in the arse and it won't be pretty.
>>
>> ________________________________
>> From: "Brian ::" <bc@xxxxxxxx>
>> Sent: 18 Nov 2016 11:52 p.m.
>> To: sjust@xxxxxxxxxx
>> Cc: Craig Chi; ceph-users@xxxxxxxxxxxxxx; Nick Fisk
>> Subject: Re:  how possible is that ceph cluster crash
>>
>>>
>>> This is like your mother telling you not to cross the road when you were
>>> four years old, but not telling you it was because you could be flattened
>>> by a car :)
>>>
>>> Can you expand on your answer? If you are in a DC with A+B power,
>>> redundant UPS, dual feeds from the electric company, onsite generators,
>>> and dual-PSU servers, is it still a bad idea?
>>>
>>>
>>>
>>>
>>> On Fri, Nov 18, 2016 at 6:52 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>>>>
>>>> Never *ever* use nobarrier with ceph under *any* circumstances.  I
>>>> cannot stress this enough.
>>>> -Sam
>>>>
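To make the "committing a transaction" point concrete: the durability
contract most software relies on is that once fsync() returns, the data
survives a power loss. With write barriers enabled, the filesystem backs
that promise by issuing cache-flush commands to the drive; with nobarrier
it skips them, so acknowledged writes may still be sitting in the drive's
volatile cache when the power goes. A rough illustration of the
write/fsync/rename pattern that depends on that contract (the file names
here are just placeholders):

    #!/usr/bin/env python3
    """Illustration: the write/fsync/rename pattern that relies on barriers.

    fsync() returning is only a real durability guarantee if the filesystem
    flushes the drive's volatile write cache (i.e. barriers are enabled).
    """
    import os

    def durable_write(path, data: bytes):
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(data)
            f.flush()
            # With barriers, this forces the data (and a cache flush) to
            # media.  With nobarrier, the kernel may report success while
            # the bytes are still only in the drive's volatile cache.
            os.fsync(f.fileno())
        os.rename(tmp, path)  # atomically replace the old version
        # fsync the directory so the rename itself is durable too.
        dirfd = os.open(os.path.dirname(os.path.abspath(path)), os.O_DIRECTORY)
        try:
            os.fsync(dirfd)
        finally:
            os.close(dirfd)

    if __name__ == "__main__":
        durable_write("example.dat", b"committed transaction\n")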
>>>> On Fri, Nov 18, 2016 at 10:39 AM, Craig Chi <craigchi@xxxxxxxxxxxx>
>>>> wrote:
>>>>>
>>>>> Hi Nick and other Cephers,
>>>>>
>>>>> Thanks for your reply.
>>>>>
>>>>>> 2) Config Errors
>>>>>> This is an easy one to assume you are safe from, but I would say most
>>>>>> outages and data loss incidents I have seen on the mailing lists have
>>>>>> been due to poor hardware choices or configuring options such as
>>>>>> size=2, min_size=1, or enabling stuff like nobarrier.
>>>>>
>>>>>
>>>>> I am wondering about the pros and cons of using the nobarrier option
>>>>> with Ceph.
>>>>>
>>>>> It is well known that nobarrier is dangerous when a power outage
>>>>> happens, but if we already have replicas in different racks or on
>>>>> different PDUs, does Ceph reduce the risk of data loss with this option?
>>>>>
>>>>> I have seen many performance tuning articles recommending the nobarrier
>>>>> option for XFS, but not many of them mention its trade-offs.
>>>>>
>>>>> Is it really unacceptable to use nobarrier in a production environment?
>>>>> I would be very grateful if you are willing to share any experiences
>>>>> with nobarrier and XFS.
>>>>>
>>>>> Sincerely,
>>>>> Craig Chi (Product Developer)
>>>>> Synology Inc. Taipei, Taiwan. Ext. 361
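On the XFS tuning articles: a low-effort sanity check is to confirm that
none of the filesystems backing your OSDs are actually mounted with
nobarrier. A minimal sketch that parses /proc/mounts on Linux (the XFS
filter and the warning format are my own assumptions):

    #!/usr/bin/env python3
    """Sketch: warn about XFS filesystems mounted with nobarrier.

    Parses /proc/mounts on Linux; the idea is to flag OSD data mounts
    (e.g. under /var/lib/ceph/osd) that have write barriers disabled.
    """

    def mounts_with_nobarrier(mounts_file="/proc/mounts"):
        flagged = []
        with open(mounts_file) as f:
            for line in f:
                # Each line: device mountpoint fstype options dump pass
                device, mountpoint, fstype, options = line.split()[:4]
                if fstype == "xfs" and "nobarrier" in options.split(","):
                    flagged.append((device, mountpoint, options))
        return flagged

    if __name__ == "__main__":
        bad = mounts_with_nobarrier()
        if bad:
            for device, mountpoint, options in bad:
                print(f"WARNING: {device} on {mountpoint} uses nobarrier "
                      f"({options})")
        else:
            print("No XFS mounts with nobarrier found.")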
>>>>>
>>>>> On 2016-11-17 05:04, Nick Fisk <nick@xxxxxxxxxx> wrote:
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf
>>>>>> Of
>>>>>> Pedro Benites
>>>>>> Sent: 16 November 2016 17:51
>>>>>> To: ceph-users@xxxxxxxxxxxxxx
>>>>>> Subject:  how possible is that ceph cluster crash
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have a Ceph cluster with 50 TB and 15 OSDs. It has been working fine
>>>>>> for one year, and I would like to grow it and migrate all my old
>>>>>> storage, about 100 TB, to Ceph, but I have a doubt. How likely is it
>>>>>> that the cluster fails and everything goes very badly?
>>>>>
>>>>>
>>>>> Everything is possible; I think there are three main risks:
>>>>>
>>>>> 1) Hardware failure
>>>>> I would say Ceph is probably one of the safest options with regard to
>>>>> hardware failures, certainly if you start using 4TB+ disks.
>>>>>
>>>>> 2) Config Errors
>>>>> This is an easy one to assume you are safe from, but I would say most
>>>>> outages and data loss incidents I have seen on the mailing lists have
>>>>> been due to poor hardware choices or configuring options such as
>>>>> size=2, min_size=1, or enabling stuff like nobarrier (a quick check
>>>>> for the pool settings is sketched below).
>>>>>
>>>>> 3) Ceph Bugs
>>>>> Probably the rarest, but potentially the scariest, as you have less
>>>>> control. They do happen, and it's something to be aware of.
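On the size=2, min_size=1 point, it only takes a minute to audit your
pools. A minimal sketch, assuming the ceph CLI is available and that
"ceph osd pool ls detail -f json" reports pool_name, size and min_size
fields (field names may differ slightly between releases):

    #!/usr/bin/env python3
    """Sketch: flag pools configured with risky replication settings.

    Assumes a working `ceph` CLI and that the JSON from
    `ceph osd pool ls detail -f json` includes pool_name, size and min_size.
    """
    import json
    import subprocess

    def risky_pools():
        out = subprocess.run(
            ["ceph", "osd", "pool", "ls", "detail", "-f", "json"],
            check=True, capture_output=True, text=True,
        ).stdout
        risky = []
        for pool in json.loads(out):
            size = pool.get("size", 0)
            min_size = pool.get("min_size", 0)
            # size=2 / min_size=1 is the combination called out above:
            # a single failure can leave you writing with no redundancy.
            if size < 3 or min_size < 2:
                risky.append((pool.get("pool_name", "?"), size, min_size))
        return risky

    if __name__ == "__main__":
        for name, size, min_size in risky_pools():
            print(f"Pool {name}: size={size}, min_size={min_size} "
                  "-- consider size>=3, min_size>=2")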
>>>>>
>>>>>> How reliable is Ceph?
>>>>>>
>>>>>> What is the risk of losing my data? Is it necessary to back up my data?
>>>>>
>>>>> Yes, always back up your data, no matter what solution you use. Just as
>>>>> RAID != backup, Ceph isn't a backup either.
>>>>>
>>>>>>
>>>>>> Regards.
>>>>>> Pedro.
>>>>>
>>>>> Sent from Synology MailPlus
>>>
>>>
>>
>>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


