Re: HEALTH_ERR 18624 pgs stuck inactive; 18624 pgs stuck unclean; no osds

I believe this is because you specified "hostname" rather than "host" for the OSDs in your ceph.conf. "hostname" isn't a config option that anything in Ceph recognizes. :) 
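For example (just a sketch reusing your own hostnames and devices), the per-OSD sections should read:

[osd.0]
host = testserver109
devs = /dev/sda1
[osd.1]
host = testserver109
devs = /dev/sdb1
...and so on through osd.95.

With "host" set, re-running mkcephfs should actually create and push the OSDs, and after "service ceph -a start" a "ceph osd tree" (or "ceph -s") should list osd.0 - osd.95 instead of reporting "no osds".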
-Greg


On Wednesday, January 30, 2013 at 8:12 AM, femi anjorin wrote:

> Hi,
> 
> Can anyone help with this?
> 
> I am running a cluster of 6 servers, each with 16 hard drives. I
> mounted all the drives on the recommended mount points
> /var/lib/ceph/osd/ceph-n. They look like this:
> /dev/sda1 on /var/lib/ceph/osd/ceph-0
> /dev/sdb1 on /var/lib/ceph/osd/ceph-1
> /dev/sdc1 on /var/lib/ceph/osd/ceph-2
> /dev/sdd1 on /var/lib/ceph/osd/ceph-3
> /dev/sde1 on /var/lib/ceph/osd/ceph-4
> /dev/sdf1 on /var/lib/ceph/osd/ceph-5
> /dev/sdg1 on /var/lib/ceph/osd/ceph-6
> /dev/sdh1 on /var/lib/ceph/osd/ceph-7
> /dev/sdi1 on /var/lib/ceph/osd/ceph-8
> /dev/sdj1 on /var/lib/ceph/osd/ceph-9
> /dev/sdk1 on /var/lib/ceph/osd/ceph-10
> /dev/sdl1 on /var/lib/ceph/osd/ceph-11
> /dev/sdm1 on /var/lib/ceph/osd/ceph-12
> /dev/sdn1 on /var/lib/ceph/osd/ceph-13
> /dev/sdo1 on /var/lib/ceph/osd/ceph-14
> /dev/sdp1 on /var/lib/ceph/osd/ceph-15
> 
> 
> Below is a summarized copy of my ceph.conf file. Since I have 16
> drives on each server, I configured osd.0 - osd.95,
> along with 3 monitors and 1 MDS server.
> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> [global]
> auth cluster required = cephx
> auth service required = cephx
> auth client required = cephx
> debug ms = 1
> [osd]
> osd journal size = 10000
> filestore xattr use omap = true
> 
> [osd.0]
> hostname = testserver109
> devs = /dev/sda1
> [osd.1]
> hostname = testserver109
> devs = /dev/sdb1
> .
> .
> .
> [osd.16]
> hostname = testserver110
> devs = /dev/sda1
> .
> .
> [osd.95]
> hostname = testserver114
> devs = /dev/sdp1
> 
> [mon]
> mon data = /var/lib/ceph/mon/$cluster-$id
> 
> [mon.a]
> host = testserver109
> mon addr = 172.16.1.9:6789
> 
> [mon.b]
> host = testserver110
> mon addr = 172.16.1.10:6789
> 
> [mon.c]
> host = testserver111
> mon addr = 172.16.1.11:6789
> [mds.a]
> host = testserver025
> 
> [mon]
> debug mon = 20
> debug paxos = 20
> debug auth = 20
> 
> [osd]
> debug osd = 20
> debug filestore = 20
> debug journal = 20
> debug monc = 20
> 
> [mds]
> debug mds = 20
> debug mds balancer = 20
> debug mds log = 20
> debug mds migrator = 20
> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> 
> Steps:
> 1. I did mkcephfs -a -c /etc/ceph/ceph.conf -k ceph.keyring
> temp dir is /tmp/mkcephfs.G5cBEIaS1o
> preparing monmap in /tmp/mkcephfs.G5cBEIaS1o/monmap
> /usr/bin/monmaptool --create --clobber --add a 172.16.1.9:6789 --add b
> 172.16.1.10:6789 --add c 172.16.1.11:6789 --print
> /tmp/mkcephfs.G5cBEIaS1o/monmap
> /usr/bin/monmaptool: monmap file /tmp/mkcephfs.G5cBEIaS1o/monmap
> /usr/bin/monmaptool: generated fsid 3dd34cbf-e228-4ced-850c-68cde0a7d8b5
> epoch 0
> fsid 3dd34cbf-e228-4ced-850c-68cde0a7d8b5
> last_changed 2013-01-30 12:38:14.564735
> created 2013-01-30 12:38:14.564735
> 0: 172.16.1.9:6789/0 mon.a
> 1: 172.16.1.10:6789/0 mon.b
> 2: 172.16.1.11:6789/0 mon.c
> /usr/bin/monmaptool: writing epoch 0 to
> /tmp/mkcephfs.G5cBEIaS1o/monmap (3 monitors)
> === mds.a ===
> creating private key for mds.a keyring /var/lib/ceph/mds/ceph-a/keyring
> creating /var/lib/ceph/mds/ceph-a/keyring
> Building generic osdmap from /tmp/mkcephfs.G5cBEIaS1o/conf
> /usr/bin/osdmaptool: osdmap file '/tmp/mkcephfs.G5cBEIaS1o/osdmap'
> /usr/bin/osdmaptool: writing epoch 1 to /tmp/mkcephfs.G5cBEIaS1o/osdmap
> Generating admin key at /tmp/mkcephfs.G5cBEIaS1o/keyring.admin
> creating /tmp/mkcephfs.G5cBEIaS1o/keyring.admin
> Building initial monitor keyring
> added entity mds.a auth auth(auid = 18446744073709551615
> key=AQAnBglRaGP7MxAANo/xsy5P9NxMzCZGmHQDCw== with 0 caps)
> === mon.a ===
> pushing everything to testserver109
> /usr/bin/ceph-mon: created monfs at /var/lib/ceph/mon/ceph-a for mon.a
> === mon.b ===
> pushing everything to testserver110
> /usr/bin/ceph-mon: created monfs at /var/lib/ceph/mon/ceph-b for mon.b
> === mon.c ===
> pushing everything to testserver111
> /usr/bin/ceph-mon: created monfs at /var/lib/ceph/mon/ceph-c for mon.c
> placing client.admin keyring in ceph.keyring
> 
> ---------------------------------------------------------------------------------------------------------------------------------------
> Apparently the monitors and the MDS got created, and ceph.keyring was
> created, BUT the OSDs were not created.
> 
> ----------------------------------------------------------------------------------------------------------------------------------------
> 2. I copied the ceph.keyring to all nodes
> 3. I ran "service ceph -a start" (on all nodes)
> 4. I ran "ceph health" (on the node where I ran mkcephfs)
> 
> 2013-01-30 13:12:18.822022 7f80ea476760 1 -- :/0 messenger.start
> 2013-01-30 13:12:18.822911 7f80ea476760 1 -- :/3458 -->
> 172.16.1.9:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- ?+0
> 0x131aae0 con 0x131a700
> 2013-01-30 13:12:18.823439 7f80ea474700 1 -- 172.16.0.25:0/3458
> learned my addr 172.16.0.25:0/3458
> 2013-01-30 13:12:18.824574 7f80dd7bb700 1 -- 172.16.0.25:0/3458 <==
> mon.0 172.16.1.9:6789/0 1 ==== mon_map v1 ==== 473+0+0 (3454127086 0
> 0) 0x7f80d0000b10 con 0x131a700
> 2013-01-30 13:12:18.824687 7f80dd7bb700 1 -- 172.16.0.25:0/3458 <==
> mon.0 172.16.1.9:6789/0 2 ==== auth_reply(proto 2 0 Success) v1 ====
> 33+0+0 (3089139024 0 0) 0x7f80d0000eb0 con 0x131a700
> 2013-01-30 13:12:18.824847 7f80dd7bb700 1 -- 172.16.0.25:0/3458 -->
> 172.16.1.9:6789/0 -- auth(proto 2 32 bytes epoch 0) v1 -- ?+0
> 0x7f80d4001620 con 0x131a700
> 2013-01-30 13:12:18.826010 7f80dd7bb700 1 -- 172.16.0.25:0/3458 <==
> mon.0 172.16.1.9:6789/0 3 ==== auth_reply(proto 2 0 Success) v1 ====
> 206+0+0 (3859488439 0 0) 0x7f80d0000eb0 con 0x131a700
> 2013-01-30 13:12:18.826130 7f80dd7bb700 1 -- 172.16.0.25:0/3458 -->
> 172.16.1.9:6789/0 -- auth(proto 2 165 bytes epoch 0) v1 -- ?+0
> 0x7f80d4003720 con 0x131a700
> 2013-01-30 13:12:18.827557 7f80dd7bb700 1 -- 172.16.0.25:0/3458 <==
> mon.0 172.16.1.9:6789/0 4 ==== auth_reply(proto 2 0 Success) v1 ====
> 409+0+0 (4218726993 0 0) 0x7f80d0000eb0 con 0x131a700
> 2013-01-30 13:12:18.827654 7f80dd7bb700 1 -- 172.16.0.25:0/3458 -->
> 172.16.1.9:6789/0 -- mon_subscribe({monmap=0+}) v2 -- ?+0 0x131adc0
> con 0x131a700
> 2013-01-30 13:12:18.827715 7f80ea476760 1 -- 172.16.0.25:0/3458 -->
> 172.16.1.9:6789/0 -- mon_command(health v 0) v1 -- ?+0 0x13188d0 con
> 0x131a700
> 2013-01-30 13:12:18.828343 7f80dd7bb700 1 -- 172.16.0.25:0/3458 <==
> mon.0 172.16.1.9:6789/0 5 ==== mon_map v1 ==== 473+0+0 (3454127086 0
> 0) 0x7f80d00010e0 con 0x131a700
> 2013-01-30 13:12:18.828394 7f80dd7bb700 1 -- 172.16.0.25:0/3458 <==
> mon.0 172.16.1.9:6789/0 6 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0
> (3529768468 0 0) 0x7f80d00012c0 con 0x131a700
> HEALTH_ERR 18624 pgs stuck inactive; 18624 pgs stuck unclean; no osds
> 2013-01-30 13:12:18.906689 7f80dd7bb700 1 -- 172.16.0.25:0/3458 <==
> mon.0 172.16.1.9:6789/0 7 ==== mon_command_ack([health]=0 HEALTH_ERR
> 18624 pgs stuck inactive; 18624 pgs stuck unclean; no osds v0) v1 ====
> 109+0+0 (1820397562 0 0) 0x7f80d0000eb0 con 0x131a700
> 2013-01-30 13:12:18.906749 7f80ea476760 1 -- 172.16.0.25:0/3458 mark_down_all
> 2013-01-30 13:12:18.906826 7f80ea476760 1 -- 172.16.0.25:0/3458
> shutdown complete.
> 
> ----------------------------------------------------------------------------------------------------------------------------------
> Issues:
> " HEALTH_ERR 18624 pgs stuck inactive; 18624 pgs stuck unclean; no osds"
> "mark_down_all"
> "shutdown complete"



--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

