Ceph in Production: best practice to monitor OSD up/down status

Hello,

I started working with Ceph a few weeks ago, so I might be asking a very
newbie question, but I could not find an answer in the docs or in the
mailing list archive for this.

Quick description of my setup:
I have a Ceph cluster with two servers. Each server has 3 SSD drives that I
use for journals only. To map SAS disks that keep their journal on the same
SSD drive to different failure domains, I wrote my own CRUSH map.
I now have a total of 36 OSDs, and ceph health returns HEALTH_OK.
I run the cluster with a couple of pools with size=3 and min_size=3.
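For context, the kind of layout I mean is sketched below. This is only a
simplified illustration, not my real map; the bucket type, names, ids and
weights are placeholders:

  # custom bucket type so that each journal SSD is its own failure domain
  type 0 osd
  type 1 journalgroup
  type 2 host
  type 3 root

  # all SAS OSDs that journal to the first SSD of server1
  journalgroup server1-ssd0 {
          id -10
          alg straw
          hash 0  # rjenkins1
          item osd.0 weight 1.000
          item osd.1 weight 1.000
          # ... remaining OSDs journaling to this SSD
  }
  # (host and root buckets containing the journalgroups are omitted here)

  # replicated rule that never places two copies behind the same journal SSD
  rule sas_by_journal {
          ruleset 1
          type replicated
          min_size 1
          max_size 10
          step take default
          step chooseleaf firstn 0 type journalgroup
          step emit
  }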


Production operations questions:
I manually stopped some OSDs to simulate a failure.
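For example, I did something along these lines (the exact init commands
depend on the distro and how the daemons are managed):

  # stop one OSD daemon on one of the servers
  service ceph stop osd.12

  # watch how the cluster reacts
  ceph osd tree | grep down
  ceph -w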

As far as I understand, an "OSD down" condition is not enough to make
Ceph start creating new copies of objects; I noticed that I must also mark
the OSD "out" before Ceph produces new copies.
As far as I understand, min_size=3 puts an object in read-only mode if there
are not at least 3 copies of that object available.
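In other words, recovery only starts after something like:

  # mark the stopped OSD out so CRUSH re-maps its placement groups
  ceph osd out 12

  # backfill/recovery activity is then visible with
  ceph -w

and the pool settings I am referring to are the ones queried/set with:

  ceph osd pool get <poolname> size
  ceph osd pool get <poolname> min_size
  # e.g. lowering min_size would be
  ceph osd pool set <poolname> min_size 2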

Is this behavior correct, or did I make some mistake creating the cluster?
Should I expect Ceph to automatically produce a new copy of objects
when some OSDs are down?
Is there any option to automatically mark "out" OSDs that go "down"?
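(The closest thing I have found so far is the "mon osd down out interval"
setting, which, if I understand the docs correctly, makes the monitors mark
a "down" OSD "out" automatically after a timeout, e.g. in ceph.conf:

  [mon]
          # automatically mark a "down" OSD "out" after this many seconds
          mon osd down out interval = 300

while "ceph osd set noout" prevents that from happening. I am not sure this
is the recommended way to handle it in production, though.)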

thanks

Saverio



