Martin,

Would you mind posting your Ceph configuration file too? I don't see any
value set for "mon_host": ""
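(For reference, the runtime values a monitor is actually using, including
mon_host and the mark-out interval, can be dumped over its admin socket as
described in the docs link below. A minimal sketch, assuming the default
socket path for mon.a on its own host:

  # Run on the host carrying mon.a; the socket path is the default one
  # and depends on the "admin socket" setting in ceph.conf.
  ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config show | grep mon_host
  ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config show | grep mon_osd_down_out_interval
)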
On Thu, Mar 28, 2013 at 1:04 PM, Martin Mailand <martin@xxxxxxxxxxxx> wrote:
> Hi Greg,
>
> the dump from mon.a is attached.
>
> -martin
>
> On 28.03.2013 20:55, Gregory Farnum wrote:
>> Hmm. The monitor code for checking this all looks good to me. Can you
>> go to one of your monitor nodes and dump the config?
>> (http://ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=admin%20socket#viewing-a-configuration-at-runtime)
>> -Greg
>>
>> On Thu, Mar 28, 2013 at 12:33 PM, Martin Mailand <martin@xxxxxxxxxxxx> wrote:
>>> Hi,
>>>
>>> I get the same behavior on a newly created cluster as well, with no
>>> changes to the cluster config at all.
>>> I stopped osd.1; after 20 seconds it got marked down, but it never got
>>> marked out.
>>>
>>> ceph version 0.59 (cbae6a435c62899f857775f66659de052fb0e759)
>>>
>>> -martin
>>>
>>> On 28.03.2013 19:48, John Wilkins wrote:
>>>> Martin,
>>>>
>>>> Greg is talking about noout. With Ceph, you can specifically prevent
>>>> OSDs from being marked out while they are down, to avoid
>>>> rebalancing--e.g., during upgrades, short-term maintenance, etc.
>>>>
>>>> http://ceph.com/docs/master/rados/operations/troubleshooting-osd/#stopping-w-out-rebalancing
>>>>
>>>> On Thu, Mar 28, 2013 at 11:12 AM, Martin Mailand <martin@xxxxxxxxxxxx> wrote:
>>>>> Hi Greg,
>>>>>
>>>>> Setting the osd out manually triggered the recovery.
>>>>> But now the question is: why is the osd not marked out after 300
>>>>> seconds? That's a default cluster; I use the 0.59 build from your
>>>>> site and didn't change any value except for the crushmap.
>>>>>
>>>>> That's my ceph.conf.
>>>>>
>>>>> -martin
>>>>>
>>>>> [global]
>>>>> auth cluster required = none
>>>>> auth service required = none
>>>>> auth client required = none
>>>>> # log file = ""
>>>>> log_max_recent=100
>>>>> log_max_new=100
>>>>>
>>>>> [mon]
>>>>> mon data = /data/mon.$id
>>>>> [mon.a]
>>>>> host = store1
>>>>> mon addr = 192.168.195.31:6789
>>>>> [mon.b]
>>>>> host = store3
>>>>> mon addr = 192.168.195.33:6789
>>>>> [mon.c]
>>>>> host = store5
>>>>> mon addr = 192.168.195.35:6789
>>>>> [osd]
>>>>> journal aio = true
>>>>> osd data = /data/osd.$id
>>>>> osd mount options btrfs = rw,noatime,nodiratime,autodefrag
>>>>> osd mkfs options btrfs = -n 32k -l 32k
>>>>>
>>>>> [osd.0]
>>>>> host = store1
>>>>> osd journal = /dev/sdg1
>>>>> btrfs devs = /dev/sdc
>>>>> [osd.1]
>>>>> host = store1
>>>>> osd journal = /dev/sdh1
>>>>> btrfs devs = /dev/sdd
>>>>> [osd.2]
>>>>> host = store1
>>>>> osd journal = /dev/sdi1
>>>>> btrfs devs = /dev/sde
>>>>> [osd.3]
>>>>> host = store1
>>>>> osd journal = /dev/sdj1
>>>>> btrfs devs = /dev/sdf
>>>>> [osd.4]
>>>>> host = store2
>>>>> osd journal = /dev/sdg1
>>>>> btrfs devs = /dev/sdc
>>>>> [osd.5]
>>>>> host = store2
>>>>> osd journal = /dev/sdh1
>>>>> btrfs devs = /dev/sdd
>>>>> [osd.6]
>>>>> host = store2
>>>>> osd journal = /dev/sdi1
>>>>> btrfs devs = /dev/sde
>>>>> [osd.7]
>>>>> host = store2
>>>>> osd journal = /dev/sdj1
>>>>> btrfs devs = /dev/sdf
>>>>> [osd.8]
>>>>> host = store3
>>>>> osd journal = /dev/sdg1
>>>>> btrfs devs = /dev/sdc
>>>>> [osd.9]
>>>>> host = store3
>>>>> osd journal = /dev/sdh1
>>>>> btrfs devs = /dev/sdd
>>>>> [osd.10]
>>>>> host = store3
>>>>> osd journal = /dev/sdi1
>>>>> btrfs devs = /dev/sde
>>>>> [osd.11]
>>>>> host = store3
>>>>> osd journal = /dev/sdj1
>>>>> btrfs devs = /dev/sdf
>>>>> [osd.12]
>>>>> host = store4
>>>>> osd journal = /dev/sdg1
>>>>> btrfs devs = /dev/sdc
>>>>> [osd.13]
>>>>> host = store4
>>>>> osd journal = /dev/sdh1
>>>>> btrfs devs = /dev/sdd
>>>>> [osd.14]
>>>>> host = store4
>>>>> osd journal = /dev/sdi1
>>>>> btrfs devs = /dev/sde
>>>>> [osd.15]
>>>>> host = store4
>>>>> osd journal = /dev/sdj1
>>>>> btrfs devs = /dev/sdf
>>>>> [osd.16]
>>>>> host = store5
>>>>> osd journal = /dev/sdg1
>>>>> btrfs devs = /dev/sdc
>>>>> [osd.17]
>>>>> host = store5
>>>>> osd journal = /dev/sdh1
>>>>> btrfs devs = /dev/sdd
>>>>> [osd.18]
>>>>> host = store5
>>>>> osd journal = /dev/sdi1
>>>>> btrfs devs = /dev/sde
>>>>> [osd.19]
>>>>> host = store5
>>>>> osd journal = /dev/sdj1
>>>>> btrfs devs = /dev/sdf
>>>>> [osd.20]
>>>>> host = store6
>>>>> osd journal = /dev/sdg1
>>>>> btrfs devs = /dev/sdc
>>>>> [osd.21]
>>>>> host = store6
>>>>> osd journal = /dev/sdh1
>>>>> btrfs devs = /dev/sdd
>>>>> [osd.22]
>>>>> host = store6
>>>>> osd journal = /dev/sdi1
>>>>> btrfs devs = /dev/sde
>>>>> [osd.23]
>>>>> host = store6
>>>>> osd journal = /dev/sdj1
>>>>> btrfs devs = /dev/sdf
>>>>>
>>>>>
>>>>> On 28.03.2013 19:01, Gregory Farnum wrote:
>>>>>> Your crush map looks fine to me. I'm saying that your ceph -s output
>>>>>> showed the OSD still hadn't been marked out. No data will be migrated
>>>>>> until it's marked out.
>>>>>> After ten minutes it should have been marked out, but that's based on
>>>>>> a number of factors you have some control over. If you just want a
>>>>>> quick check of your crush map you can mark it out manually, too.
>>>>>> -Greg
>>>>> _______________________________________________
>>>>> ceph-users mailing list
>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>>
>>>> --
>>>> John Wilkins
>>>> Senior Technical Writer
>>>> Inktank
>>>> john.wilkins@xxxxxxxxxxx
>>>> (415) 425-9599
>>>> http://inktank.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
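For anyone finding this thread later: the commands below are a minimal
sketch of the mark-out controls discussed above, following the
troubleshooting page linked earlier and using osd.1 from this example.

  # Mark the OSD out by hand (this is what triggered recovery above):
  ceph osd out 1

  # Prevent OSDs from being marked out at all (e.g. during maintenance),
  # and re-enable the automatic behaviour afterwards:
  ceph osd set noout
  ceph osd unset noout

  # The automatic mark-out delay is governed by "mon osd down out interval"
  # (in seconds), which can be set in the [mon] or [global] section of
  # ceph.conf, for example:
  #   mon osd down out interval = 300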