Re: What happens to the OSDs if the OS disk dies?

"wido@xxxxxxxx" <wido@xxxxxxxx> · Sat, 13 Aug 2016 09:43:26 +0200

> On 13 Aug 2016, at 08:58, Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx> wrote:
> 
> 
>>> On 13 Aug 2016, at 03:19, Bill Sharer wrote:
>>> 
>>> If all the system disk does is handle the o/s (ie osd journals are
>>> on dedicated or osd drives as well), no problem. Just rebuild the
>>> system and copy the ceph.conf back in when you re-install ceph.
>>> Keep a spare copy of your original fstab to keep your osd filesystem
>>> mounts straight.
>> 
>> With systems deployed with ceph-disk/ceph-deploy you no longer need an
>> fstab. Udev handles the mounts.
>> 
>>> Just keep in mind that you are down 11 osds while that system drive
>>> gets rebuilt though. It's safer to do 10 osds and then have a
>>> mirror set for the system disk.
>> 
>> In the years that I have been running Ceph I have rarely seen OS disks
>> fail. Why bother? Ceph is designed for failure.
>> 
>> I would not sacrifice an OSD slot for an OS disk. Also, let's say an
>> additional OS disk is €100.
>> 
>> If you put that disk in 20 machines that's €2,000. For that money
>> you can even buy an additional chassis.
>> 
>> No, I would run on a single OS disk. It fails? Let it fail. Re-install
>> and you're good again.
>> 
>> Ceph makes sure the data is safe.
>> 
> 
> Wido,
> 
> Can you elaborate a little more on this? How does Ceph achieve that? Is it through redundant MONs?
> 

No, Ceph replicates across hosts by default. So you can lose a host and the other ones will still have copies.
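
As a rough illustration of why host-level replication survives the loss of a whole host, here is a toy sketch. It is not Ceph's CRUSH algorithm: the round-robin placement, the host names, the number of placement groups and the replica count of 3 are all assumptions made for the example. The only point it shows is that when every copy of an object sits on a different host, one dead host (for example, one whose OS disk failed) removes at most one copy.

```python
# Toy model of host-level replication. This is NOT Ceph's CRUSH algorithm;
# the host names, PG count, replica count and round-robin placement are
# assumptions made purely for illustration.

def place_replicas(pg_id, hosts, size=3):
    """Return `size` distinct hosts for one placement group."""
    if size > len(hosts):
        raise ValueError("need at least `size` hosts")
    start = pg_id % len(hosts)
    return [hosts[(start + i) % len(hosts)] for i in range(size)]

def surviving_copies(placement, failed_host):
    """Hosts that still hold a copy after `failed_host` goes down."""
    return [h for h in placement if h != failed_host]

hosts = [f"host{i}" for i in range(10)]
placements = {pg: place_replicas(pg, hosts) for pg in range(64)}

# Worst case over all placement groups when one host (e.g. its OS disk) dies.
worst = min(len(surviving_copies(p, "host3")) for p in placements.values())
print(worst)  # 2
```

In this model every placement group keeps at least two of its three copies, so data stays available while the failed host is re-installed or its data is re-replicated elsewhere.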

> To my understanding the OSD mapping is needed to bring the cluster back. In our setup (and I assume in others as well) it is stored on the OS disk. Furthermore, our MONs are running on the same hosts as the OSDs. So if the OS disk fails we lose not only the OSD host but also the MON node. Is there another way to be protected against such a failure besides additional MONs?
> 

Aha, MON on the OSD host. I never recommend that. Try to use dedicated machines with a good SSD for MONs.

Technically you can run the MON on the OSD nodes, but I always try to avoid it. It just isn't practical when stuff really goes wrong.

Wido

> We recently had a problem where a user accidentally deleted a volume. Of course this has nothing to do with OS disk failure itself, but it prompted us to start looking for other possible failures in our system that could jeopardize data, and this thread got my attention.
> 
> 
> Warmest regards,
> 
> George
> 
> 
>> Wido
>> 
>> Bill Sharer
>> 
>>> On 08/12/2016 03:33 PM, Ronny Aasen wrote:
>>> 
>>>> On 12.08.2016 13:41, Félix Barbeira wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> I'm planning to build a Ceph cluster but I have a serious question.
>>>> At the moment we have ~10 Dell R730xd servers with 12x4TB SATA
>>>> disks. The official Ceph docs say:
>>>> 
>>>> "We recommend using a dedicated drive for the operating system and
>>>> software, and one drive for each Ceph OSD Daemon you run on the
>>>> host."
>>>> 
>>>> I could use for example 1 disk for the OS and 11 for OSD data. In
>>>> the operating system I would run 11 daemons to control the OSDs.
>>>> But what happens to the cluster if the disk with the OS fails?
>>>> Maybe the cluster thinks that 11 OSDs failed and tries to
>>>> re-replicate all that data across the cluster. That doesn't sound good.
>>>> 
>>>> Should I use 2 disks for the OS in a RAID1? In that case I'm
>>>> "wasting" 8TB just for the ~10GB that the OS needs.
>>>> 
>>>> All the docs I've been reading say that Ceph has no single point
>>>> of failure, so I think this scenario must have an optimal
>>>> solution. Maybe somebody could help me.
>>>> 
>>>> Thanks in advance.
>>>> 
>>>> --
>>>> 
>>>> Félix Barbeira.
>>> If you do not have dedicated slots on the back for OS disks, then I
>>> would recommend using SATADOM flash modules plugged directly into an
>>> internal SATA port in the machine. That saves you 2 slots for OSDs,
>>> and they are quite reliable. You could even use 2 SD cards if your
>>> machine has the internal SD slot.
>>> 
>>> 
>>> http://www.dell.com/downloads/global/products/pedge/en/poweredge-idsdm-whitepaper-en.pdf [1]
>>> 
>>> kind regards
>>> Ronny Aasen
>>> 
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx [2]
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [3]
>>> 
>> 
>> 
>> Links:
>> ------
>> [1] http://www.dell.com/downloads/global/products/pedge/en/poweredge-idsdm-whitepaper-en.pdf
>> [2] mailto:ceph-users@xxxxxxxxxxxxxx
>> [3] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> [4] mailto:bsharer@xxxxxxxxxxxxxx
> 