Re: Ceph in Production: best practice to monitor OSD up/down status

On Sun, Mar 22, 2015 at 2:55 AM, Saverio Proto <zioproto@xxxxxxxxx> wrote:
> Hello,
>
> I started working with Ceph a few weeks ago, so this might be a very
> newbie question, but I could not find an answer in the docs or in the
> ML archive.
>
> Quick description of my setup:
> I have a Ceph cluster with two servers. Each server has 3 SSD drives
> that I use for journals only. I wrote my own crushmap so that SAS disks
> journaling to the same SSD drive map to different failure domains.
> I now have a total of 36 OSDs. ceph health returns HEALTH_OK.
> I run the cluster with a couple of pools with size=3 and min_size=3.
>
>
> Production operations questions:
> I manually stopped some OSDs to simulate a failure.
>
> As far as I understand, an "OSD down" condition is not enough to make
> Ceph start making new copies of objects. I noticed that I must mark
> the OSD "out" to make Ceph produce new copies.
> As far as I understand, min_size=3 makes objects read-only if there
> are not at least 3 copies of the object available.

That is correct, but the default min_size with size 3 is 2, and you
probably want to use that instead. If you have size == min_size on
Firefly releases and lose an OSD, the cluster can't do recovery, so that
PG is stuck without manual intervention. :( This is because of some
quirks in how OSD peering and recovery work, so you'd be forgiven for
thinking it would recover nicely.
(This has changed in the upcoming Hammer release, but you probably
still want to allow cluster activity when an OSD fails, unless you're
very confident in OSD uptime and more concerned about durability
than availability.)
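
For example, something along these lines should do it (the pool name
"mypool" is just a placeholder for your actual pools):

    # check the current replication settings on a pool
    ceph osd pool get mypool size
    ceph osd pool get mypool min_size

    # keep 3 copies, but let the pool stay active (and recover) with 2
    ceph osd pool set mypool min_size 2
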
-Greg

>
> Is this behavior correct, or did I make some mistake creating the cluster?
> Should I expect Ceph to automatically produce a new copy of objects
> when some OSDs are down?
> Is there any option to automatically mark "out" OSDs that go "down"?
>
> thanks
>
> Saverio
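
Regarding your last question: if I remember correctly, the monitors
already mark a "down" OSD "out" automatically after a timeout, which is
controlled by "mon osd down out interval" (600 seconds by default, as
far as I recall). A minimal ceph.conf sketch:

    [mon]
    # seconds to wait before a "down" OSD is automatically marked "out"
    mon osd down out interval = 600
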
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



