Re: Cluster Map Problems


 



So the OSD you shut down is down and in. How long does it stay in the
degraded state? In the docs here,
http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/ , we
discuss the notion that a down OSD is not technically out of the
cluster for a while. I believe the default value is 300 seconds, which
is 5 minutes. From what I can see in your "ceph osd tree" output, all
your OSDs are running. You can change how long it takes for a down OSD
to be marked out with "mon osd down out interval", discussed in this
section:
http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/#degraded
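
For example, in ceph.conf that setting would look something like this
(the value is in seconds; 600 here is only an illustrative choice):

    [mon]
        mon osd down out interval = 600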

On Wed, Mar 27, 2013 at 5:56 PM, Martin Mailand <martin@xxxxxxxxxxxx> wrote:
> Hi,
>
> that's the config http://pastebin.com/2JzABSYt
> ceph osd dump http://pastebin.com/GSCGKL1k
> ceph osd tree http://pastebin.com/VSgPFRYv
>
> As far as I can tell they are not mapped right.
>
> osdmap e133 pool 'rbd' (2) object '2.31a' -> pg 2.f3caaf00 (2.300) -> up
> [13,23] acting [13,23]
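
For reference, a line in that form is the output of the "ceph osd map"
command; the invocation is roughly:

    ceph osd map rbd <object-name>

where <object-name> is the object being traced, and the "up" and
"acting" lists show which OSDs the placement group currently maps to.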
>
> -martin
>
> On 28.03.2013 01:09, John Wilkins wrote:
>> We need a bit more information. If you can run "ceph osd dump" and
>> "ceph osd tree", and paste your ceph.conf, we might get a bit
>> further. The CRUSH hierarchy looks okay, but I can't see the replica
>> size from this.
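
A quick way to check that is something like:

    ceph osd dump | grep size

or, per pool:

    ceph osd pool get rbd size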
>>
>> Have you followed this procedure to see if your object is getting
>> remapped? http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/#finding-an-object-location
>>
>> On Thu, Mar 21, 2013 at 12:02 PM, Martin Mailand <martin@xxxxxxxxxxxx> wrote:
>>> Hi,
>>>
>>> I want to change my crushmap to reflect my setup: I have two racks
>>> with 3 hosts each. For the rbd pool I want to use a replication size
>>> of 2. The failure domain should be the rack, so one replica should
>>> go into each rack. That works so far.
>>> But if I shut down a host, the cluster stays degraded; I want the
>>> now missing replicas to be re-replicated to the two remaining hosts
>>> in that rack.
>>>
>>> Here is the crushmap.
>>> http://pastebin.com/UaB6LfKs
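
As a general illustration only (the actual map is behind the pastebin
link above, and the names here are assumed), a rule that places one
replica in each rack usually looks something like this:

    rule rbd {
            ruleset 2
            type replicated
            min_size 1
            max_size 10
            step take default
            step chooseleaf firstn 0 type rack
            step emit
    }

The "chooseleaf firstn 0 type rack" step tells CRUSH to pick one OSD
under a different rack for each replica; the rule name, the ruleset
number and the "default" root are illustrative.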
>>>
>>> Any idea what I did wrong?
>>>
>>> -martin
>>
>>
>>



-- 
John Wilkins
Senior Technical Writer
Inktank
john.wilkins@xxxxxxxxxxx
(415) 425-9599
http://inktank.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



