Re: Cluster Map Problems

Gregory Farnum <greg@xxxxxxxxxxx> · Thu, 28 Mar 2013 12:55:12 -0700



Hmm. The monitor code for checking this all looks good to me. Can you
go to one of your monitor nodes and dump the config?
(http://ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=admin%20socket#viewing-a-configuration-at-runtime)
-Greg
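
For anyone following along, here is a minimal sketch of narrowing such a config dump down to the settings that usually decide whether a down OSD gets marked out. It assumes the dump is a flat JSON object keyed by underscore-style option names. The helper name `mark_out_settings`, the option list, and the sample values are illustrative assumptions, not something taken from the monitor code.

```python
import json

# Monitor options that commonly influence whether a down OSD is marked out.
# The names follow the Ceph configuration reference; whether each one exists
# in a given release is an assumption to verify against the dump itself.
MARK_OUT_OPTIONS = (
    "mon_osd_down_out_interval",
    "mon_osd_min_in_ratio",
    "mon_osd_down_out_subtree_limit",
)


def mark_out_settings(config_dump: str) -> dict:
    """Return the mark-out related options found in a JSON config dump.

    Options absent from the dump map to None, so a missing key is visible
    rather than silently skipped.
    """
    config = json.loads(config_dump)
    return {name: config.get(name) for name in MARK_OUT_OPTIONS}


if __name__ == "__main__":
    # Hypothetical excerpt of a dump; the values are illustrative only.
    sample = '{"mon_osd_down_out_interval": "300", "mon_osd_min_in_ratio": "0.3"}'
    for name, value in mark_out_settings(sample).items():
        print(f"{name} = {value}")
```

The dump text itself would come from the admin socket on a monitor node, as described at the link above, for example `ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config show`. The socket path shown is the usual default location and may differ on your nodes.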

On Thu, Mar 28, 2013 at 12:33 PM, Martin Mailand <martin@xxxxxxxxxxxx> wrote:
> Hi,
>
> I get the same behavior on a newly created cluster as well, with no
> changes to the cluster config at all.
> I stopped osd.1, and after 20 seconds it was marked down, but it never
> gets marked out.
>
> ceph version 0.59 (cbae6a435c62899f857775f66659de052fb0e759)
>
> -martin
>
> On 28.03.2013 19:48, John Wilkins wrote:
>> Martin,
>>
>> Greg is talking about noout. With Ceph, you can specifically preclude
>> OSDs from being marked out when down to prevent rebalancing--e.g.,
>> during upgrades, short-term maintenance, etc.
>>
>> http://ceph.com/docs/master/rados/operations/troubleshooting-osd/#stopping-w-out-rebalancing
>>
>> On Thu, Mar 28, 2013 at 11:12 AM, Martin Mailand <martin@xxxxxxxxxxxx> wrote:
>>> Hi Greg,
>>>
>>> Setting the OSD out manually triggered the recovery.
>>> But the question now is: why is the OSD not marked out after 300
>>> seconds? It's a default cluster, using the 0.59 build from your site.
>>> I didn't change any values, except for the crushmap.
>>>
>>> That's my ceph.conf.
>>>
>>> -martin
>>>
>>> [global]
>>>         auth cluster required = none
>>>         auth service required = none
>>>         auth client required = none
>>> #       log file = ""
>>>         log_max_recent=100
>>>         log_max_new=100
>>>
>>> [mon]
>>>         mon data = /data/mon.$id
>>> [mon.a]
>>>         host = store1
>>>         mon addr = 192.168.195.31:6789
>>> [mon.b]
>>>         host = store3
>>>         mon addr = 192.168.195.33:6789
>>> [mon.c]
>>>         host = store5
>>>         mon addr = 192.168.195.35:6789
>>> [osd]
>>>         journal aio = true
>>>         osd data = /data/osd.$id
>>>         osd mount options btrfs = rw,noatime,nodiratime,autodefrag
>>>         osd mkfs options btrfs = -n 32k -l 32k
>>>
>>> [osd.0]
>>>         host = store1
>>>         osd journal = /dev/sdg1
>>>         btrfs devs = /dev/sdc
>>> [osd.1]
>>>         host = store1
>>>         osd journal = /dev/sdh1
>>>         btrfs devs = /dev/sdd
>>> [osd.2]
>>>         host = store1
>>>         osd journal = /dev/sdi1
>>>         btrfs devs = /dev/sde
>>> [osd.3]
>>>         host = store1
>>>         osd journal = /dev/sdj1
>>>         btrfs devs = /dev/sdf
>>> [osd.4]
>>>         host = store2
>>>         osd journal = /dev/sdg1
>>>         btrfs devs = /dev/sdc
>>> [osd.5]
>>>         host = store2
>>>         osd journal = /dev/sdh1
>>>         btrfs devs = /dev/sdd
>>> [osd.6]
>>>         host = store2
>>>         osd journal = /dev/sdi1
>>>         btrfs devs = /dev/sde
>>> [osd.7]
>>>         host = store2
>>>         osd journal = /dev/sdj1
>>>         btrfs devs = /dev/sdf
>>> [osd.8]
>>>         host = store3
>>>         osd journal = /dev/sdg1
>>>         btrfs devs = /dev/sdc
>>> [osd.9]
>>>         host = store3
>>>         osd journal = /dev/sdh1
>>>         btrfs devs = /dev/sdd
>>> [osd.10]
>>>         host = store3
>>>         osd journal = /dev/sdi1
>>>         btrfs devs = /dev/sde
>>> [osd.11]
>>>         host = store3
>>>         osd journal = /dev/sdj1
>>>         btrfs devs = /dev/sdf
>>> [osd.12]
>>>         host = store4
>>>         osd journal = /dev/sdg1
>>>         btrfs devs = /dev/sdc
>>> [osd.13]
>>>         host = store4
>>>         osd journal = /dev/sdh1
>>>         btrfs devs = /dev/sdd
>>> [osd.14]
>>>         host = store4
>>>         osd journal = /dev/sdi1
>>>         btrfs devs = /dev/sde
>>> [osd.15]
>>>         host = store4
>>>         osd journal = /dev/sdj1
>>>         btrfs devs = /dev/sdf
>>> [osd.16]
>>>         host = store5
>>>         osd journal = /dev/sdg1
>>>         btrfs devs = /dev/sdc
>>> [osd.17]
>>>         host = store5
>>>         osd journal = /dev/sdh1
>>>         btrfs devs = /dev/sdd
>>> [osd.18]
>>>         host = store5
>>>         osd journal = /dev/sdi1
>>>         btrfs devs = /dev/sde
>>> [osd.19]
>>>         host = store5
>>>         osd journal = /dev/sdj1
>>>         btrfs devs = /dev/sdf
>>> [osd.20]
>>>         host = store6
>>>         osd journal = /dev/sdg1
>>>         btrfs devs = /dev/sdc
>>> [osd.21]
>>>         host = store6
>>>         osd journal = /dev/sdh1
>>>         btrfs devs = /dev/sdd
>>> [osd.22]
>>>         host = store6
>>>         osd journal = /dev/sdi1
>>>         btrfs devs = /dev/sde
>>> [osd.23]
>>>         host = store6
>>>         osd journal = /dev/sdj1
>>>         btrfs devs = /dev/sdf
>>>
>>>
>>> On 28.03.2013 19:01, Gregory Farnum wrote:
>>>> Your crush map looks fine to me. I'm saying that your ceph -s output
>>>> showed the OSD still hadn't been marked out. No data will be migrated
>>>> until it's marked out.
>>>> After ten minutes it should have been marked out, but that's based on
>>>> a number of factors you have some control over. If you just want a
>>>> quick check of your crush map you can mark it out manually, too.
>>>> -Greg
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com