On Mon, Mar 23, 2015 at 7:17 AM, Saverio Proto <zioproto@xxxxxxxxx> wrote:
> Hello,
>
> thanks for the answers.
>
> This was exactly what I was looking for:
>
> mon_osd_down_out_interval = 900
>
> I was not waiting long enough to see my cluster recover by itself.
> That's why I tried to increase min_size, because I did not understand
> what min_size was for.
>
> Now that I know what min_size is, I guess the best setting for me is
> min_size = 1, because I would like to be able to perform I/O operations
> even if only 1 copy is left.

I'd strongly recommend leaving it at two: if you reduce it to 1 then you
can lose data by having just one disk die at an inopportune moment,
whereas if you leave it at 2 the system won't accept writes when only a
single hard drive holds the data. Leaving it at two, the system will
still try to re-replicate back up to three copies once "mon osd down out
interval" has elapsed after a failure. :)
-Greg

>
> Thanks to all for helping!
>
> Saverio
>
>
>
> 2015-03-23 14:58 GMT+01:00 Gregory Farnum <greg@xxxxxxxxxxx>:
>> On Sun, Mar 22, 2015 at 2:55 AM, Saverio Proto <zioproto@xxxxxxxxx> wrote:
>>> Hello,
>>>
>>> I started working with Ceph a few weeks ago, so I might be asking a
>>> very newbie question, but I could not find an answer in the docs or
>>> in the ml archive for this.
>>>
>>> Quick description of my setup:
>>> I have a Ceph cluster with two servers. Each server has 3 SSD drives
>>> that I use for journals only. To put SAS disks that journal to the
>>> same SSD drive into different failure domains, I wrote my own
>>> crushmap. I now have a total of 36 OSDs. Ceph health returns
>>> HEALTH_OK. I run the cluster with a couple of pools with size=3 and
>>> min_size=3.
>>>
>>>
>>> Production operations questions:
>>> I manually stopped some OSDs to simulate a failure.
>>>
>>> As far as I understood, an "OSD down" condition is not enough to make
>>> Ceph start making new copies of objects. I noticed that I must mark
>>> the OSD as "out" to make Ceph produce new copies.
>>> As far as I understood, min_size=3 makes objects read-only if there
>>> are not at least 3 copies of each object available.
>>
>> That is correct, but the default min_size with size 3 is 2, and you
>> probably want to use that instead. If you have size == min_size on
>> Firefly releases and lose an OSD, the PG can't do recovery and is
>> stuck without manual intervention. :( This is because of some quirks
>> in how OSD peering and recovery work, so you'd be forgiven for
>> thinking it would recover nicely.
>> (This is changed in the upcoming Hammer release, but you probably
>> still want to allow cluster activity when an OSD fails, unless you're
>> very confident in their uptime and more concerned about durability
>> than availability.)
>> -Greg
>>
>>>
>>> Is this behavior correct, or did I make a mistake creating the
>>> cluster?
>>> Should I expect Ceph to automatically produce new copies of objects
>>> when some OSDs are down?
>>> Is there any option to automatically mark "out" OSDs that go "down"?
>>>
>>> thanks
>>>
>>> Saverio
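
For reference, Greg's recommendation maps onto the standard pool
commands; this is a minimal sketch, and the pool name "mypool" is a
placeholder rather than anything from the thread:

  ceph osd pool set mypool size 3      # keep three replicas of each object
  ceph osd pool set mypool min_size 2  # serve I/O while at least two replicas are up
  ceph osd pool get mypool min_size    # verify the setting took effect

With size=3 and min_size=2, a single OSD failure leaves the pool
writable while recovery restores the third copy in the background.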
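
The down/out timer Saverio found lives in ceph.conf on the monitors; a
sketch using the 900-second value quoted in the thread:

  [mon]
  mon osd down out interval = 900   # seconds a "down" OSD waits before being marked "out"

It can also be changed on a running cluster without a restart, along
these lines:

  ceph tell mon.* injectargs '--mon_osd_down_out_interval 900'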
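
And for the manual-intervention side of the question, marking an OSD
out by hand (the id 12 here is a placeholder) and suppressing the
automatic down-to-out transition look like:

  ceph osd out 12        # trigger re-replication now, without waiting for the timer
  ceph osd set noout     # cluster-wide flag: never auto-mark down OSDs out
  ceph osd unset noout   # restore the default automatic behavior

The noout flag is handy during planned maintenance, when you expect
OSDs to go down and don't want the cluster to start re-replicating.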