Have you changed either of these maps since you originally switched to
use rule 3? Can you compare them to what you have on your test
cluster? In particular I see that you have 0 weight for all the
buckets in the crush pool, which I expect to misbehave but not to
cause the OSDs to crash everywhere.
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Tue, Jul 16, 2013 at 4:00 PM, Vladislav Gorbunov <vadikgo@xxxxxxxxx> wrote:
> The output is in the attached files.
>
> 2013/7/17 Gregory Farnum <greg@xxxxxxxxxxx>:
>> The maps in the OSDs only would have gotten there from the monitors.
>> If a bad map somehow got distributed to the OSDs then cleaning it up
>> is unfortunately going to take a lot of work without any well-defined
>> processes.
>> So if you could just do "ceph osd crush dump" and "ceph osd dump" and
>> provide the output from those commands, we can look at what the map
>> actually has and go from there.
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>
>> On Tue, Jul 16, 2013 at 3:22 PM, Vladislav Gorbunov <vadikgo@xxxxxxxxx> wrote:
>>> Gregory, thanks for your help!
>>> After all the OSD servers went down, I set the rule set for the iscsi
>>> pool back to the default rule 0:
>>> ceph osd pool set iscsi crush_ruleset 0
>>> It did not help: none of the OSDs started, except those without data,
>>> with weight 0.
>>> Next I removed the iscsi ruleset from the crush map. That did not help
>>> either. After that I posted the crushmap to this mailing list.
>>> Is there any method to extract the crush map from a downed OSD server
>>> and inject it into the mon server? From the
>>> /var/lib/ceph/osd/ceph-2/current/omap folder?
>>>
>>> 2013/7/17 Gregory Farnum <greg@xxxxxxxxxxx>:
>>>> I notice that your first dump of the crush map didn't include rule #3.
>>>> Are you sure you've injected it into the cluster? Try extracting it
>>>> from the monitors and looking at that map directly, instead of a
>>>> locally cached version.
>>>> You mentioned some problem with OSDs being positioned wrong too, so
>>>> you might look at "ceph osd tree" and look at the shape of the map.
>>>> But it sounds to me like maybe there's a disconnect between what
>>>> you've put into the cluster, and what you're looking at.
>>>> -Greg
>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
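
For comparing the "ceph osd crush dump" output from two clusters, as suggested in the thread, a small script along these lines might help. It is only a sketch under assumptions: it assumes the dump is JSON with a top-level "buckets" list (each entry carrying "name" and "weight") and a top-level "rules" list (each entry carrying "rule_id"), and that layout may differ between Ceph versions. The function names and the sample data are made up for illustration and are not taken from the cluster discussed here.

```python
import json


def load_dump(path):
    """Load a saved crush dump from a JSON file (path is supplied by the caller)."""
    with open(path) as f:
        return json.load(f)


def zero_weight_buckets(crush_dump):
    """Return names of buckets whose weight is 0.

    Assumes each bucket is a dict with "name" and "weight" keys.
    """
    return [b["name"] for b in crush_dump.get("buckets", [])
            if b.get("weight", 0) == 0]


def rule_ids(crush_dump):
    """Return the sorted rule ids in the dump (assumes a "rule_id" key)."""
    return sorted(r["rule_id"] for r in crush_dump.get("rules", []))


# Hypothetical sample for illustration only.
SAMPLE = {
    "buckets": [
        {"id": -1, "name": "default", "weight": 0, "items": []},
        {"id": -2, "name": "host-a", "weight": 65536,
         "items": [{"id": 0, "weight": 65536}]},
    ],
    "rules": [{"rule_id": 0}, {"rule_id": 3}],
}
```

Running zero_weight_buckets and rule_ids on the dumps saved from the production and test clusters, then comparing the results, would show whether rule 3 is present in both maps and which buckets carry weight 0.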