Re: all osd crash on start

Sorry, this was not sent to ceph-users earlier.

I checked the mon.1 log and found that the cluster was not in HEALTH_OK
when the crush ruleset for the iscsi pool was changed:
2013-07-14 15:52:15.715871 7fe8a852a700  0 log [INF] : pgmap v16861121: 19296 pgs: 19052 active+clean, 73 active+remapped+wait_backfill, 171 active+remapped+backfilling; 9023 GB data, 18074 GB used, 95096 GB / 110 TB avail; 21245KB/s rd, 1892KB/s wr, 443op/s; 49203/4696557 degraded (1.048%)
2013-07-14 15:52:15.870389 7fe8a852a700  0 mon.1@0(leader) e23 handle_command mon_command(osd pool set iscsi crush_ruleset 3 v 0) v1
...
2013-07-14 15:52:35.930465 7fe8a852a700  1 mon.1@0(leader).osd e77415 prepare_failure osd.2 10.166.10.27:6801/12007 from osd.56 10.166.10.29:6896/18516 is reporting failure:1
2013-07-14 15:52:35.930641 7fe8a852a700  0 log [DBG] : osd.2 10.166.10.27:6801/12007 reported failed by osd.56 10.166.10.29:6896/18516

Could this be an indicator that the bad map was distributed to the
cluster's OSD servers (the failure being reported by osd.56)? Does this
mean that you cannot change the crushmap of a cluster that is not in
HEALTH_OK, or you risk losing the whole cluster?
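
If that is the case, I assume the safer sequence would have been to
wait for HEALTH_OK before touching the crush rules, something like this
(only my understanding, please correct me if it is wrong):

ceph health
ceph -s
# change the ruleset only after the cluster reports HEALTH_OK
ceph osd pool set iscsi crush_ruleset 3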

full log at https://dl.dropboxusercontent.com/u/2296931/ceph/ceph-mon.1.log.bak.zip
(1.7MB)

> If a bad map somehow got distributed to the OSDs then cleaning it up
> is unfortunately going to take a lot of work without any well-defined
> processes.
Does this mean that all the data is lost?
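
To check what map the monitors are actually distributing now, I plan to
extract it directly from them and compare it with the crushmap I posted
earlier and with the test cluster, as you suggested. I believe the
commands are roughly these (the /tmp paths are just examples, please
correct me if I have them wrong):

ceph osd getcrushmap -o /tmp/crushmap.bin
crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
ceph osd dump > /tmp/osdmap.txt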

2013/7/17 Gregory Farnum <greg@xxxxxxxxxxx>:
> Have you changed either of these maps since you originally switched to
> use rule 3?
>
> Can you compare them to what you have on your test cluster? In
> particular I see that you have 0 weight for all the buckets in the
> crush pool, which I expect to misbehave but not to cause the OSD to
> crash everywhere.
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On Tue, Jul 16, 2013 at 4:00 PM, Vladislav Gorbunov <vadikgo@xxxxxxxxx> wrote:
>> output is in the attached files
>>
>> 2013/7/17 Gregory Farnum <greg@xxxxxxxxxxx>:
>>> The maps in the OSDs only would have gotten there from the monitors.
>>> If a bad map somehow got distributed to the OSDs then cleaning it up
>>> is unfortunately going to take a lot of work without any well-defined
>>> processes.
>>> So if you could just do "ceph osd crush dump" and "ceph osd dump" and
>>> provide the output from those commands, we can look at what the map
>>> actually has and go from there.
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>>
>>> On Tue, Jul 16, 2013 at 3:22 PM, Vladislav Gorbunov <vadikgo@xxxxxxxxx> wrote:
>>>> Gregory, thanks for your help!
>>>> After all the OSD servers went down, I set the ruleset for the iscsi pool
>>>> back to the default rule 0:
>>>> ceph osd pool set iscsi crush_ruleset 0
>>>> It did not help: none of the OSDs start, except the ones without data (weight 0).
>>>> Next I removed the iscsi ruleset from the crush map. That did not help
>>>> either, and after that I posted the crushmap to this mailing list.
>>>> Is there any way to extract the crush map from a downed OSD server and
>>>> inject it into the mon server? From the /var/lib/ceph/osd/ceph-2/current/omap
>>>> folder?
>>>>
>>>> 2013/7/17 Gregory Farnum <greg@xxxxxxxxxxx>:
>>>>> I notice that your first dump of the crush map didn't include rule #3.
>>>>> Are you sure you've injected it into the cluster? Try extracting it
>>>>> from the monitors and looking at that map directly, instead of a
>>>>> locally cached version.
>>>>> You mentioned some problem with OSDs being positioned wrong too, so
>>>>> you might look at "ceph osd tree" and look at the shape of the map.
>>>>> But it sounds to me like maybe there's a disconnect between what
>>>>> you've put into the cluster, and what you're looking at.
>>>>> -Greg
>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



