Martin,

I'm just speculating: I just rewrote the networking section, there is an empty mon_host value in your dump, and I recall a chat last week where mon_host is now treated as a separate setting. Maybe you could try specifying:

[mon.a]
    mon host = store1
    mon addr = 192.168.195.31:6789

and so on for the other monitors. I'm assuming that's not actually the problem, but I want to make sure my docs are right on this point.

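Spelled out for all three of your monitors, that would look something like the following. I'm only lifting the hostnames and addresses from the [mon.b] and [mon.c] sections of your ceph.conf below, so treat it as a sketch to compare against rather than something I've verified:

[mon.a]
    mon host = store1
    mon addr = 192.168.195.31:6789
[mon.b]
    mon host = store3
    mon addr = 192.168.195.33:6789
[mon.c]
    mon host = store5
    mon addr = 192.168.195.35:6789

Or, if mon_host really is meant to be a single global list now, it might instead be something like "mon host = 192.168.195.31,192.168.195.33,192.168.195.35" under [global]. Again, just a guess on my part.
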
On Thu, Mar 28, 2013 at 3:24 PM, Martin Mailand <martin@xxxxxxxxxxxx> wrote:
> Hi John,
>
> my ceph.conf is a bit further down in this email.
>
> -martin
>
> Am 28.03.2013 23:21, schrieb John Wilkins:
>
>> Martin,
>>
>> Would you mind posting your Ceph configuration file too? I don't see
>> any value set for "mon_host": ""
>>
>> On Thu, Mar 28, 2013 at 1:04 PM, Martin Mailand <martin@xxxxxxxxxxxx> wrote:
>>>
>>> Hi Greg,
>>>
>>> the dump from mon.a is attached.
>>>
>>> -martin
>>>
>>> On 28.03.2013 20:55, Gregory Farnum wrote:
>>>>
>>>> Hmm. The monitor code for checking this all looks good to me. Can you
>>>> go to one of your monitor nodes and dump the config?
>>>>
>>>> (http://ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=admin%20socket#viewing-a-configuration-at-runtime)
>>>> -Greg
>>>>
>>>> On Thu, Mar 28, 2013 at 12:33 PM, Martin Mailand <martin@xxxxxxxxxxxx> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I get the same behavior on a newly created cluster as well, with no
>>>>> changes to the cluster config at all.
>>>>> I stop osd.1; after 20 seconds it gets marked down, but it never gets
>>>>> marked out.
>>>>>
>>>>> ceph version 0.59 (cbae6a435c62899f857775f66659de052fb0e759)
>>>>>
>>>>> -martin
>>>>>
>>>>> On 28.03.2013 19:48, John Wilkins wrote:
>>>>>>
>>>>>> Martin,
>>>>>>
>>>>>> Greg is talking about noout. With Ceph, you can specifically preclude
>>>>>> OSDs from being marked out while they are down, to prevent rebalancing --
>>>>>> e.g., during upgrades, short-term maintenance, etc.
>>>>>>
>>>>>> http://ceph.com/docs/master/rados/operations/troubleshooting-osd/#stopping-w-out-rebalancing
>>>>>>
>>>>>> On Thu, Mar 28, 2013 at 11:12 AM, Martin Mailand <martin@xxxxxxxxxxxx> wrote:
>>>>>>>
>>>>>>> Hi Greg,
>>>>>>>
>>>>>>> setting the OSD out manually triggered the recovery.
>>>>>>> But now the question is: why is the OSD not marked out after 300
>>>>>>> seconds? That's a default cluster; I use the 0.59 build from your
>>>>>>> site, and I didn't change any value except for the crushmap.
>>>>>>>
>>>>>>> That's my ceph.conf.
>>>>>>>
>>>>>>> -martin
>>>>>>>
>>>>>>> [global]
>>>>>>> auth cluster requierd = none
>>>>>>> auth service required = none
>>>>>>> auth client required = none
>>>>>>> # log file = ""
>>>>>>> log_max_recent=100
>>>>>>> log_max_new=100
>>>>>>>
>>>>>>> [mon]
>>>>>>> mon data = /data/mon.$id
>>>>>>> [mon.a]
>>>>>>> host = store1
>>>>>>> mon addr = 192.168.195.31:6789
>>>>>>> [mon.b]
>>>>>>> host = store3
>>>>>>> mon addr = 192.168.195.33:6789
>>>>>>> [mon.c]
>>>>>>> host = store5
>>>>>>> mon addr = 192.168.195.35:6789
>>>>>>> [osd]
>>>>>>> journal aio = true
>>>>>>> osd data = /data/osd.$id
>>>>>>> osd mount options btrfs = rw,noatime,nodiratime,autodefrag
>>>>>>> osd mkfs options btrfs = -n 32k -l 32k
>>>>>>>
>>>>>>> [osd.0]
>>>>>>> host = store1
>>>>>>> osd journal = /dev/sdg1
>>>>>>> btrfs devs = /dev/sdc
>>>>>>> [osd.1]
>>>>>>> host = store1
>>>>>>> osd journal = /dev/sdh1
>>>>>>> btrfs devs = /dev/sdd
>>>>>>> [osd.2]
>>>>>>> host = store1
>>>>>>> osd journal = /dev/sdi1
>>>>>>> btrfs devs = /dev/sde
>>>>>>> [osd.3]
>>>>>>> host = store1
>>>>>>> osd journal = /dev/sdj1
>>>>>>> btrfs devs = /dev/sdf
>>>>>>> [osd.4]
>>>>>>> host = store2
>>>>>>> osd journal = /dev/sdg1
>>>>>>> btrfs devs = /dev/sdc
>>>>>>> [osd.5]
>>>>>>> host = store2
>>>>>>> osd journal = /dev/sdh1
>>>>>>> btrfs devs = /dev/sdd
>>>>>>> [osd.6]
>>>>>>> host = store2
>>>>>>> osd journal = /dev/sdi1
>>>>>>> btrfs devs = /dev/sde
>>>>>>> [osd.7]
>>>>>>> host = store2
>>>>>>> osd journal = /dev/sdj1
>>>>>>> btrfs devs = /dev/sdf
>>>>>>> [osd.8]
>>>>>>> host = store3
>>>>>>> osd journal = /dev/sdg1
>>>>>>> btrfs devs = /dev/sdc
>>>>>>> [osd.9]
>>>>>>> host = store3
>>>>>>> osd journal = /dev/sdh1
>>>>>>> btrfs devs = /dev/sdd
>>>>>>> [osd.10]
>>>>>>> host = store3
>>>>>>> osd journal = /dev/sdi1
>>>>>>> btrfs devs = /dev/sde
>>>>>>> [osd.11]
>>>>>>> host = store3
>>>>>>> osd journal = /dev/sdj1
>>>>>>> btrfs devs = /dev/sdf
>>>>>>> [osd.12]
>>>>>>> host = store4
>>>>>>> osd journal = /dev/sdg1
>>>>>>> btrfs devs = /dev/sdc
>>>>>>> [osd.13]
>>>>>>> host = store4
>>>>>>> osd journal = /dev/sdh1
>>>>>>> btrfs devs = /dev/sdd
>>>>>>> [osd.14]
>>>>>>> host = store4
>>>>>>> osd journal = /dev/sdi1
>>>>>>> btrfs devs = /dev/sde
>>>>>>> [osd.15]
>>>>>>> host = store4
>>>>>>> osd journal = /dev/sdj1
>>>>>>> btrfs devs = /dev/sdf
>>>>>>> [osd.16]
>>>>>>> host = store5
>>>>>>> osd journal = /dev/sdg1
>>>>>>> btrfs devs = /dev/sdc
>>>>>>> [osd.17]
>>>>>>> host = store5
>>>>>>> osd journal = /dev/sdh1
>>>>>>> btrfs devs = /dev/sdd
>>>>>>> [osd.18]
>>>>>>> host = store5
>>>>>>> osd journal = /dev/sdi1
>>>>>>> btrfs devs = /dev/sde
>>>>>>> [osd.19]
>>>>>>> host = store5
>>>>>>> osd journal = /dev/sdj1
>>>>>>> btrfs devs = /dev/sdf
>>>>>>> [osd.20]
>>>>>>> host = store6
>>>>>>> osd journal = /dev/sdg1
>>>>>>> btrfs devs = /dev/sdc
>>>>>>> [osd.21]
>>>>>>> host = store6
>>>>>>> osd journal = /dev/sdh1
>>>>>>> btrfs devs = /dev/sdd
>>>>>>> [osd.22]
>>>>>>> host = store6
>>>>>>> osd journal = /dev/sdi1
>>>>>>> btrfs devs = /dev/sde
>>>>>>> [osd.23]
>>>>>>> host = store6
>>>>>>> osd journal = /dev/sdj1
>>>>>>> btrfs devs = /dev/sdf
>>>>>>>
>>>>>>> On 28.03.2013 19:01, Gregory Farnum wrote:
>>>>>>>>
>>>>>>>> Your crush map looks fine to me. I'm saying that your ceph -s output
>>>>>>>> showed the OSD still hadn't been marked out. No data will be migrated
>>>>>>>> until it's marked out.
>>>>>>>> After ten minutes it should have been marked out, but that's based on
>>>>>>>> a number of factors you have some control over. If you just want a
>>>>>>>> quick check of your crush map you can mark it out manually, too.
>>>>>>>> -Greg

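One more note, in case it saves you a round trip: the manual step Greg mentions is just "ceph osd out 1" (for the osd.1 you stopped), and the runtime check from his doc link is the monitor's admin socket, something along the lines of

    ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config show | grep mon_osd_down_out_interval

run on store1. I'm assuming the default socket path there, so adjust it if your mon's admin socket lives somewhere else. mon_osd_down_out_interval is the setting that controls how long a down OSD waits before being marked out, so it's worth confirming what it is actually set to at runtime.
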
--
John Wilkins
Senior Technical Writer
Inktank
john.wilkins@xxxxxxxxxxx
(415) 425-9599
http://inktank.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com