Re: all osd crash on start

Gregory Farnum <greg@xxxxxxxxxxx> · Tue, 16 Jul 2013 16:08:20 -0700

Have you changed either of these maps since you originally switched to
use rule 3?

Can you compare them to what you have on your test cluster? In
particular I see that you have 0 weight for all the buckets in the
crush pool, which I expect to misbehave but not to cause the OSD to
crash everywhere.
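
One way to spot the zero-weight buckets mentioned above is to scan the JSON printed by "ceph osd crush dump". The following is only a minimal sketch. It assumes the dump has a top-level "buckets" list whose entries carry "name" and "weight" fields; the helper name zero_weight_buckets and the sample data are hypothetical and are not taken from the attached files.

```python
import json


def zero_weight_buckets(crush_dump):
    """Return the names of buckets whose weight is 0 in a parsed crush dump.

    Assumes each entry under "buckets" has "name" and "weight" keys;
    a bucket with no "weight" key is treated as weight 0.
    """
    return [
        bucket["name"]
        for bucket in crush_dump.get("buckets", [])
        if bucket.get("weight", 0) == 0
    ]


# Hypothetical sample standing in for real "ceph osd crush dump" output.
sample = json.loads("""
{
  "buckets": [
    {"id": -1, "name": "default", "weight": 0, "items": []},
    {"id": -2, "name": "host-a", "weight": 65536,
     "items": [{"id": 0, "weight": 65536}]}
  ]
}
""")

print(zero_weight_buckets(sample))
```

On a live cluster the input could instead be loaded with json.load() from a file holding the saved output of "ceph osd crush dump". Running the same check against the test cluster's dump gives a quick way to compare the two.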
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Tue, Jul 16, 2013 at 4:00 PM, Vladislav Gorbunov <vadikgo@xxxxxxxxx> wrote:
> output is in the attached files
>
> 2013/7/17 Gregory Farnum <greg@xxxxxxxxxxx>:
>> The maps on the OSDs could only have gotten there from the monitors.
>> If a bad map somehow got distributed to the OSDs then cleaning it up
>> is unfortunately going to take a lot of work without any well-defined
>> processes.
>> So if you could just do "ceph osd crush dump" and "ceph osd dump" and
>> provide the output from those commands, we can look at what the map
>> actually has and go from there.
>> -Greg
>>
>>
>> On Tue, Jul 16, 2013 at 3:22 PM, Vladislav Gorbunov <vadikgo@xxxxxxxxx> wrote:
>>> Gregory, thank you for your help!
>>> After all the OSD servers went down, I set the rule set for the iscsi pool
>>> back to the default rule 0:
>>> ceph osd pool set iscsi crush_ruleset 0
>>> It did not help: none of the OSDs started, except the ones without data,
>>> which have weight 0. Next I removed the iscsi ruleset from the crush map.
>>> That did not help either. After that I posted the crushmap to this mailing list.
>>> Is there any way to extract the crush map from a downed OSD server and inject
>>> it into the mon server? From the /var/lib/ceph/osd/ceph-2/current/omap
>>> folder?
>>>
>>> 2013/7/17 Gregory Farnum <greg@xxxxxxxxxxx>:
>>>> I notice that your first dump of the crush map didn't include rule #3.
>>>> Are you sure you've injected it into the cluster? Try extracting it
>>>> from the monitors and looking at that map directly, instead of a
>>>> locally cached version.
>>>> You mentioned some problem with OSDs being positioned wrong too, so
>>>> you might look at "ceph osd tree" and look at the shape of the map.
>>>> But it sounds to me like maybe there's a disconnect between what
>>>> you've put into the cluster, and what you're looking at.
>>>> -Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com