Re: how possible is that ceph cluster crash

Many reasons:

1) You will eventually get a DC-wide power event anyway, at which point
probably most of the OSDs will have hopelessly corrupted internal xfs
structures (yes, I have seen this happen to a poor soul with a DC with
redundant power).
2) Even in the case of a single rack/node power failure, the biggest
danger isn't that the OSDs don't start.  It's that they *do start*,
but have forgotten or arbitrarily corrupted a random subset of the
transactions they told other OSDs and clients they had committed (see
the sketch after this list).  The exact impact would be random, but
any guarantees Ceph normally provides would be out the window.  RBD
devices could have random byte ranges zapped back in time (not great
if those happen to be the offsets holding your database or fs
journal...) for instance.
3) Deliberately powercycling a node counts as a power failure if you
don't stop services and sync etc first.
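To make point 2 concrete, here is a minimal sketch of what "committing a
transaction" means at the filesystem level.  This is plain Python for
illustration, not anything from the Ceph code base, and the path and
helper name are invented:

import os

def commit(path, data):
    """Write data and acknowledge it only once it should be durable."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, data)
        # Durability point: an OSD journal conceptually does the same thing
        # before telling peers and clients "committed".  With barriers
        # enabled, this flush reaches stable storage; with nobarrier,
        # fsync() can return while the data still sits in the drive's
        # volatile write cache, so a power cut can throw away writes that
        # were already acknowledged.
        os.fsync(fd)
    finally:
        os.close(fd)

commit("/tmp/osd-journal-demo.bin", b"txn-123\n")
print("acknowledged as committed")

Whether that fsync() promise actually holds is exactly what the
barrier/nobarrier mount option (together with the drive's write cache
settings) decides.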

In other words, don't mess with the definition of "committing a
transaction" if you value your data.
-Sam "just say no" Just

On Fri, Nov 18, 2016 at 4:04 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> Yes, because these things happen
>
> http://www.theregister.co.uk/2016/11/15/memset_power_cut_service_interruption/
>
> We had customers who had kit in this DC.
>
> To use your analogy, it's like crossing the road at traffic lights but not
> checking cars have stopped. You might be OK 99% of the time, but sooner or
> later it will bite you in the arse and it won't be pretty.
>
> ________________________________
> From: "Brian ::" <bc@xxxxxxxx>
> Sent: 18 Nov 2016 11:52 p.m.
> To: sjust@xxxxxxxxxx
> Cc: Craig Chi; ceph-users@xxxxxxxxxxxxxx; Nick Fisk
> Subject: Re:  how possible is that ceph cluster crash
>
>>
>>
>> This is like your mother telling you not to cross the road when you were 4
>> years of age but not telling you it was because you could be flattened
>> by a car :)
>>
>> Can you expand on your answer? If you are in a DC with AB power,
>> redundant UPS, dual feed from the electric company, onsite generators,
>> dual PSU servers, is it still a bad idea?
>>
>>
>>
>>
>> On Fri, Nov 18, 2016 at 6:52 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>>>
>>> Never *ever* use nobarrier with ceph under *any* circumstances.  I
>>> cannot stress this enough.
>>> -Sam
>>>
>>> On Fri, Nov 18, 2016 at 10:39 AM, Craig Chi <craigchi@xxxxxxxxxxxx>
>>> wrote:
>>>>
>>>> Hi Nick and other Cephers,
>>>>
>>>> Thanks for your reply.
>>>>
>>>>> 2) Config Errors
>>>>> This can be an easy one to say you are safe from. But I would say most
>>>>> outages and data loss incidents I have seen on the mailing lists have
>>>>> been due to poor hardware choice or configuring options such as
>>>>> size=2, min_size=1, or enabling stuff like nobarrier.
>>>>
>>>>
>>>> I am wondering about the pros and cons of the nobarrier option when it
>>>> is used under Ceph.
>>>>
>>>> It is well known that nobarrier is dangerous when a power outage
>>>> happens, but if we already have replicas in different racks or PDUs,
>>>> will Ceph reduce the risk of data loss with this option?
>>>>
>>>> I have seen many performance tuning articles recommending the nobarrier
>>>> option for xfs, but not many of them mention its trade-offs.
>>>>
>>>> Is it really unacceptable to use nobarrier in a production environment?
>>>> I would be very grateful if you are willing to share any experiences
>>>> with nobarrier and xfs.
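To see why replication alone does not remove this risk, here is a toy
simulation (plain Python, not Ceph code; the class, method and transaction
names are invented for illustration).  Every replica acknowledges the write
before it is durable, nobarrier-style, and a correlated power event then
hits all of them:

class Replica:
    """Toy OSD: acknowledges writes immediately, flushes its cache lazily."""
    def __init__(self):
        self.disk = []    # what survives a power cut
        self.cache = []   # volatile drive cache, never flushed here (nobarrier-style)

    def write(self, txn):
        self.cache.append(txn)
        return "committed"        # acknowledged before reaching stable storage

    def power_cut(self):
        self.cache.clear()        # volatile cache contents are simply gone

replicas = [Replica() for _ in range(3)]      # e.g. three racks / PDUs
print([r.write("txn-42") for r in replicas])  # ['committed', 'committed', 'committed']

for r in replicas:                            # DC-wide power event hits every copy at once
    r.power_cut()

print(any("txn-42" in r.disk for r in replicas))  # False: acknowledged write lost everywhere

Separate racks or PDUs only help if at least one copy actually reached
stable storage before the lights went out, which is precisely what
nobarrier stops guaranteeing.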
>>>>
>>>> Sincerely,
>>>> Craig Chi (Product Developer)
>>>> Synology Inc. Taipei, Taiwan. Ext. 361
>>>>
>>>> On 2016-11-17 05:04, Nick Fisk <nick@xxxxxxxxxx> wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Pedro Benites
>>>>> Sent: 16 November 2016 17:51
>>>>> To: ceph-users@xxxxxxxxxxxxxx
>>>>> Subject:  how possible is that ceph cluster crash
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have a ceph cluster with 50 TB and 15 osds. It has been working fine
>>>>> for one year, and I would like to grow it and migrate all my old
>>>>> storage, about 100 TB, to ceph, but I have a doubt. How likely is it
>>>>> that the cluster fails and everything goes very badly?
>>>>
>>>>
>>>> Everything is possible; I think there are 3 main risks:
>>>>
>>>> 1) Hardware failure
>>>> I would say Ceph is probably one of the safest options with regard to
>>>> hardware failures, certainly if you start using 4TB+ disks.
>>>>
>>>> 2) Config Errors
>>>> This can be an easy one to say you are safe from. But I would say most
>>>> outages and data loss incidents I have seen on the mailing lists have
>>>> been due to poor hardware choice or configuring options such as
>>>> size=2, min_size=1, or enabling stuff like nobarrier (see the sketch
>>>> after this list).
>>>>
>>>> 3) Ceph Bugs
>>>> Probably the rarest, but potentially the scariest, as you have less
>>>> control. They do happen, and it's something to be aware of.
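To illustrate the size=2, min_size=1 point, here is a toy sketch (again
plain Python rather than anything from Ceph; the function name and replica
counts are made up for illustration) of why that combination is risky:
writes keep being accepted while only a single copy of the data exists.

def writes_accepted(active_replicas, min_size):
    """A placement group keeps serving writes while at least min_size copies are active."""
    return active_replicas >= min_size

# size=2, min_size=1: one failed disk leaves a single copy, yet writes are
# still accepted; losing (or corrupting) that last copy loses acknowledged data.
print(writes_accepted(active_replicas=1, min_size=1))   # True
# size=3, min_size=2: after a single failure every acknowledged write still
# has two copies behind it; with only one copy left, I/O blocks instead.
print(writes_accepted(active_replicas=1, min_size=2))   # False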
>>>>
>>>>> How reliable is ceph?
>>>>>
>>>>> What is the risk of losing my data? Is it necessary to back up my data?
>>>>
>>>>
>>>> Yes, always back up your data, no matter what solution you use. Just as
>>>> RAID != backup, Ceph isn't a backup either.
>>>>
>>>>>
>>>>> Regards.
>>>>> Pedro.
>>
>>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


