OSDMap problem: osd does not exist.

Hi,

recently I tried to upgrade from 0.57 to 0.67.3, hit the mon protocol
changes, and so updated all 5 of the mons.
After upgrading the mons (and while debugging some other problems)
I removed the mon filesystem and recreated it from scratch.
The OSDMap and crushmap seem to have been wiped out by this;
at the time I expected they would be easy to recover.
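
(For reference, the mon filesystem was recreated roughly along these lines,
shown here for mon.0: grab the current monmap, stop the mon, wipe its data
directory, rebuild it with ceph-mon --mkfs, and start it again.  The paths
below are placeholders, not my exact invocation.)

# ceph mon getmap -o /tmp/monmap
# /etc/init.d/ceph stop mon.0
# rm -rf /var/lib/ceph/mon/ceph-0
# ceph-mon --mkfs -i 0 --monmap /tmp/monmap --keyring /tmp/mon.keyring
# /etc/init.d/ceph start mon.0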

Now I have tried to recover them, but I cannot make the osdmap remember
any OSDs.  Would somebody help me, please?

osd dump shows:
% ceph osd dump
epoch 5
fsid 6ef8e7bb-745c-4e20-8946-8808b9843380
created 2013-09-11 18:36:36.731905
modified 2013-09-12 00:53:46.375623
flags 

pool 0 'data' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0
pool 2 'rbd' rep size 2 min_size 1 crush_ruleset 2 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0

max_osd 16

(nothing further)
% 

It seems that the crushmap is set correctly (attached at the end of this e-mail).
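
(For reference, the crushmap was recompiled and re-injected roughly like
this; the file names are just examples:)

# crushtool -c crushmap.txt -o crushmap.bin
# ceph osd setcrushmap -i crushmap.bin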

Although I have turned off cephx temporarily, I did register the OSDs in
the ceph auth list (manually, with the ceph auth add command).
installed auth entries:

osd.0
        key: AQCMcvZQIA6XNRAAbVshyOevLfNUI0SIkVEAzQ==
        caps: [mon] allow rwx
        caps: [osd] allow *
osd.1
        key: AQB/c/ZQKAf8OBAAIs9sHZlAjNoIn7eE2bbggg==
        caps: [mon] allow rwx
        caps: [osd] allow *
   :
   :
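
(These entries were added roughly like this, once per osd id; the keyring
path below is just an example, not my exact path:)

# ceph auth add osd.0 mon 'allow rwx' osd 'allow *' -i /ceph/data/osd-disk/keyring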

# ceph osd in 3
osd.3 does not exist. 

# /etc/init.d/ceph -a start osd.3
=== osd.3 === 
Mounting Btrfs on host-4:/ceph/data/osd-disk
Scanning for Btrfs filesystems
Error ENOENT: osd.3 does not exist.  create it before updating the crush map
Starting Ceph osd.3 on host-4...
starting osd.3 at :/0 osd_data /ceph/data/osd-disk /ceph/data/osd-journal

# /ceph/bin/ceph status
  cluster 6ef8e7bb-745c-4e20-8946-8808b9843380
   health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
   monmap e1: 5 mons at {0=A.B.C.D:6789/0,1=A.B.C.D:6789/0,2=A.B.C.D:6789/0,3=A.B.C.D:6789/0,4=A.B.C.D:6789/0}, election epoch 22, quorum 0,1,2,3,4 0,1,2,3,4
   osdmap e5: 0 osds: 0 up, 0 in
    pgmap v6: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
   mdsmap e7: 1/1/1 up {0=1=up:creating}, 4 up:standby


How can I recover the OSDMap?
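
Is it just a matter of repeating something like the following for each osd
id, as the "create it before updating the crush map" error above suggests,
or is there a safer way that keeps the data already on the disks?

# ceph osd create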

Thanks!
Yasu

(Crushmap follows.)

# begin crush map

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9
device 10 osd.10
device 11 osd.11
device 12 osd.12
device 13 osd.13
device 14 osd.14
device 15 osd.15

# types
type 0 device
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 root

# buckets
host host-1 {
	id -1		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.0 weight 1.000
}
host host-2 {
	id -2		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.1 weight 1.000
}
host host-3 {
	id -3		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.2 weight 1.000
}
host host-4 {
	id -4		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.3 weight 1.000
}
host host-5 {
	id -5		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.4 weight 1.000
}
host host-6 {
	id -6		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.5 weight 1.000
}
host host-7 {
	id -7		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.6 weight 1.000
}
host host-8 {
	id -8		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.7 weight 1.000
}
host host-9 {
	id -9		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.8 weight 1.000
}
host host-10 {
	id -10		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.9 weight 1.000
}
host host-11 {
	id -11		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.10 weight 1.000
}
host host-12 {
	id -12		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.11 weight 1.000
}
host host-13 {
	id -13		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.12 weight 1.000
}
host host-14 {
	id -14		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.13 weight 1.000
}
host host-15 {
	id -15		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.14 weight 1.000
}
host host-16 {
	id -16		# do not change unnecessarily
	# weight 1.000
	alg straw
	hash 0	# rjenkins1
	item osd.15 weight 1.000
}
rack rack0 {
	id -17		# do not change unnecessarily
	# weight 16.000
	alg straw
	hash 0	# rjenkins1
	item host-1 weight 1.000
	item host-2 weight 1.000
	item host-3 weight 1.000
	item host-4 weight 1.000
	item host-5 weight 1.000
	item host-6 weight 1.000
	item host-7 weight 1.000
	item host-8 weight 1.000
	item host-9 weight 1.000
	item host-10 weight 1.000
	item host-11 weight 1.000
	item host-12 weight 1.000
	item host-13 weight 1.000
	item host-14 weight 1.000
	item host-15 weight 1.000
	item host-16 weight 1.000
}
row row0 {
	id -18		# do not change unnecessarily
	# weight 16.000
	alg straw
	hash 0	# rjenkins1
	item rack0 weight 16.000
}
room room0 {
	id -19		# do not change unnecessarily
	# weight 16.000
	alg straw
	hash 0	# rjenkins1
	item row0 weight 16.000
}
datacenter datacenter0 {
	id -20		# do not change unnecessarily
	# weight 16.000
	alg straw
	hash 0	# rjenkins1
	item room0 weight 16.000
}
root default {
	id -21		# do not change unnecessarily
	# weight 16.000
	alg straw
	hash 0	# rjenkins1
	item datacenter0 weight 16.000
}

# rules
rule data {
	ruleset 0
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type host
	step emit
}
rule metadata {
	ruleset 1
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type host
	step emit
}
rule rbd {
	ruleset 2
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type host
	step emit
}

# end crush map