Re: what happens to the OSDs if the OS disk dies?

Nothing actually happens to your OSDs if your OS drive fails.  To prevent unnecessary backfilling off the server with the dead OS drive, you set the noout flag on the cluster, reinstall the OS on a good drive, install Ceph on it, and then restart the server.  The OSDs have all of the information they need to bring themselves back up and into the cluster.  Once they are back up, you unset noout and are good to go.
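For reference, the whole dance is just the flag plus a reinstall (a rough sketch, run from any node with an admin keyring):

    ceph osd set noout       # keep the down OSDs from being marked out
    # ... reinstall the OS, reinstall the ceph packages, restore
    # /etc/ceph/ceph.conf and the keyrings, then reboot the node ...
    ceph osd unset noout     # once the OSDs report up again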

If the OSDs had already been marked out of the cluster, then set noout, manually mark them back in via `ceph osd in <id>`, and proceed as above.  It is a very simple process to replace the OS drive of a storage node.
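For example, if the rebuilt host held osd.7 through osd.17 (ids made up here, substitute your own):

    ceph osd set noout
    ceph osd in 7            # repeat for each OSD id on that host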


David Turner | Cloud Operations Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2760 | Mobile: 385.224.2943


If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.


________________________________________
From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Cybertinus [ceph@xxxxxxxxxxxxx]
Sent: Friday, August 12, 2016 7:31 AM
To: Félix Barbeira
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: what happens to the OSDs if the OS disk dies?

Hello Felix,

When you put your OS on a single drive and that drive fails, you will
lose all the OSDs on that machine, because the entire machine goes
down. The PGs that are now missing a replica will be replicated again;
in your case, that means the PGs on those 11 OSDs.
This rebuilding doesn't start right away, though, so you can safely
reboot an OSD host without triggering a major rebalance of your data.
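If I remember correctly, the grace period before down OSDs get marked
out (and backfilling starts) is the monitor option
mon_osd_down_out_interval, 600 seconds by default. You can check it on
a monitor node (assuming the mon id is the short hostname, which is
the usual default):

    ceph daemon mon.$(hostname -s) config get mon_osd_down_out_interval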

I would put 2 drives in RAID1 if I were you. Putting 2 SSDs in the rear
2.5" slots, as suggested by Brian, sounds like the best option to me.
That way you don't lose a massive amount of storage to the OS install
(2 drives x 4 TB x 10 servers = 80 TB of raw capacity you would
otherwise lose, just for the OS installation...)
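If you do the mirror in Linux software RAID (the PERC controller in
the R730xd could also do it in hardware, and most installers can set
this up during installation), it is a single mdadm command. Device
names below are just examples:

    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdm /dev/sdn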

---
Kind regards,
Cybertinus

On 12-08-2016 13:41, Félix Barbeira wrote:

> Hi,
>
> I'm planning to build a Ceph cluster but I have a serious doubt. At this
> moment we have ~10 DELL R730xd servers with 12x4TB SATA disks each. The
> official Ceph docs say:
>
> "We recommend using a dedicated drive for the operating system and
> software, and one drive for each Ceph OSD Daemon you run on the host."
>
> I could use for example 1 disk for the OS and 11 for OSD data. On the
> operating system I would run 11 daemons to control the OSDs. But what
> happens to the cluster if the disk with the OS fails? Maybe the cluster
> thinks that 11 OSDs failed and tries to replicate all that data over the
> cluster... that doesn't sound good.
>
> Should I use 2 disks in RAID1 for the OS? In that case I'm
> "wasting" 8 TB just for the ~10 GB that the OS needs.
>
> All the docs I've been reading say Ceph has no single point of
> failure, so I think this scenario must have an optimal solution.
> Maybe somebody could help me.
>
> Thanks in advance.
> --
> Félix Barbeira.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
