Hi Goncalo! Thanks for the explanation of the pg identifier. That helped a lot. I thought the XX in pg XX.YYY was an OSD, not a pool. So pool 19 was the pool I created yesterday in an attempt to replace the broken 'volumes' pool (6). I just deleted the new pool and all the 19.XXX unclean messages went away. So I guess that's another symptom: I'm unable to create new pools.

Now that I know the first part of the pg number identifies the pool, I ran a full 'ceph health detail' (without the 'unclean' grep I used before) and found that I have problems (stuck stale) with pools 6, 7, 8, 9, 15, 16, and 17. These, I was told, have been around since the installation. They may or may not be related to my inability to access the volumes/images pools.

root@CTR01:~# ceph health detail | grep -v -e 'HEALTH_WARN' -e 'too many PGs' | awk '{print $2}' | awk -F'.' '{print $1}' | sort | uniq -c
    139 15
    153 16
    149 17
    263 6
    261 7
    298 8
    275 9

Here is a reference to the pool numbers:

root@CTR01:~# ceph osd lspools
0 rbd,1 .rgw.root,2 .rgw.control,3 .rgw,4 .rgw.gc,5 .users.uid,6 volumes,7 images,8 backups,9 vms,10 pool-name,11 object-backups,12 .users,13 .users.swift,14 .rgw.buckets,15 .rgw.buckets.index,16 .rgw.buckets.extra,17 .rgw.buckets.backups,

All of the PGs flagged in the health detail output showed the same kind of message:

pg 9.181 is stuck stale for 23290188.497474, current state stale+active+clean, last acting [17,24,52]
pg 7.180 is stuck stale for 23293255.556745, current state stale+active+clean, last acting [24,46,15]

I believe all these messages were being displayed before things stopped working (which may or may not be a valid assumption). I am unable to perform a 'ceph pg xx.xxx query' on any of the 'stale+active+clean' PGs. It hung for every PG I tried to query (I couldn't loop it because the queries never finished, even with the timeout option).

Next, I did a 'ceph pg map' on all of the PGs and received identical responses from each:

osdmap e1326 pg 9.184 (9.184) -> up [] acting []
ceph pg map 6.18a
osdmap e1326 pg 6.18a (6.18a) -> up [] acting []
ceph pg map 15.183
osdmap e1326 pg 15.183 (15.183) -> up [] acting []
ceph pg map 7.18c
osdmap e1326 pg 7.18c (7.18c) -> up [] acting []
ceph pg map 9.183
osdmap e1326 pg 9.183 (9.183) -> up [] acting []
ceph pg map 9.181
osdmap e1326 pg 9.181 (9.181) -> up [] acting []

As for the OSD logs, I could only find one recurring error, on one OSD, across 6 hosts (each with 12 disks):

Ceph06:~$ grep -iHn 'ERR' /var/log/ceph/ceph-*.log
/var/log/ceph/ceph-osd.54.log:15:2016-11-12 07:06:24.433881 7f70b288a700 0 <cls> cls/rgw/cls_rgw.cc:1976: ERROR: rgw_obj_remove(): cls_cxx_remove returned -2
/var/log/ceph/ceph-osd.54.log:44:2016-11-12 07:36:23.801908 7f70b288a700 0 <cls> cls/rgw/cls_rgw.cc:1976: ERROR: rgw_obj_remove(): cls_cxx_remove returned -2
/var/log/ceph/ceph-osd.54.log:68:2016-11-12 08:06:23.959920 7f70b288a700 0 <cls> cls/rgw/cls_rgw.cc:1976: ERROR: rgw_obj_remove(): cls_cxx_remove returned -2
/var/log/ceph/ceph-osd.54.log:79:2016-11-12 08:36:24.399491 7f70b288a700 0 <cls> cls/rgw/cls_rgw.cc:1976: ERROR: rgw_obj_remove(): cls_cxx_remove returned -2

I spent hours Googling this yesterday without finding anything applicable. I'm not sure whether it is related to the problem or not. I've restarted the host several times and the error has persisted.

Finally, here is the crushmap (and 'ceph osd tree' output). During the upgrades we replaced the drive on osd 71; the new OSD became 72. I was told that ceph03 was added later and that's why it's in the wrong place.
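(For reference, a plain-text map like the one below can be dumped with something along these lines; the paths are just placeholders, not necessarily the exact files I used:)

# Sketch only: grab the binary crushmap from the cluster and decompile it to text.
ceph osd getcrushmap -o /tmp/crushmap.bin
crushtool -d /tmp/crushmap.bin -o /tmp/crushmap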
CTR01:~# cat crushmap
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9
device 10 osd.10
device 11 osd.11
device 12 osd.12
device 13 osd.13
device 14 osd.14
device 15 osd.15
device 16 osd.16
device 17 osd.17
device 18 osd.18
device 19 osd.19
device 20 osd.20
device 21 osd.21
device 22 osd.22
device 23 osd.23
device 24 osd.24
device 25 osd.25
device 26 osd.26
device 27 osd.27
device 28 osd.28
device 29 osd.29
device 30 osd.30
device 31 osd.31
device 32 osd.32
device 33 osd.33
device 34 osd.34
device 35 osd.35
device 36 osd.36
device 37 osd.37
device 38 osd.38
device 39 osd.39
device 40 osd.40
device 41 osd.41
device 42 osd.42
device 43 osd.43
device 44 osd.44
device 45 osd.45
device 46 osd.46
device 47 osd.47
device 48 osd.48
device 49 osd.49
device 50 osd.50
device 51 osd.51
device 52 osd.52
device 53 osd.53
device 54 osd.54
device 55 osd.55
device 56 osd.56
device 57 osd.57
device 58 osd.58
device 59 osd.59
device 60 osd.60
device 61 osd.61
device 62 osd.62
device 63 osd.63
device 64 osd.64
device 65 osd.65
device 66 osd.66
device 67 osd.67
device 68 osd.68
device 69 osd.69
device 70 osd.70
device 71 device71
device 72 osd.72

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host Ceph01 {
        id -2   # do not change unnecessarily
        # weight 21.840
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 1.820
        item osd.1 weight 1.820
        item osd.2 weight 1.820
        item osd.3 weight 1.820
        item osd.4 weight 1.820
        item osd.5 weight 1.820
        item osd.6 weight 1.820
        item osd.7 weight 1.820
        item osd.8 weight 1.820
        item osd.9 weight 1.820
        item osd.10 weight 1.820
        item osd.11 weight 1.820
}
host Ceph02 {
        id -3   # do not change unnecessarily
        # weight 21.840
        alg straw
        hash 0  # rjenkins1
        item osd.12 weight 1.820
        item osd.13 weight 1.820
        item osd.14 weight 1.820
        item osd.15 weight 1.820
        item osd.16 weight 1.820
        item osd.17 weight 1.820
        item osd.18 weight 1.820
        item osd.19 weight 1.820
        item osd.20 weight 1.820
        item osd.21 weight 1.820
        item osd.22 weight 1.820
        item osd.23 weight 1.820
}
rack Rack01 {
        id -8   # do not change unnecessarily
        # weight 43.680
        alg straw
        hash 0  # rjenkins1
        item Ceph01 weight 21.840
        item Ceph02 weight 21.840
}
host Ceph03 {
        id -7   # do not change unnecessarily
        # weight 21.840
        alg straw2
        hash 0  # rjenkins1
        item osd.60 weight 1.820
        item osd.61 weight 1.820
        item osd.62 weight 1.820
        item osd.63 weight 1.820
        item osd.64 weight 1.820
        item osd.65 weight 1.820
        item osd.66 weight 1.820
        item osd.67 weight 1.820
        item osd.68 weight 1.820
        item osd.69 weight 1.820
        item osd.70 weight 1.820
        item osd.72 weight 1.820
}
host Ceph04 {
        id -4   # do not change unnecessarily
        # weight 21.840
        alg straw
        hash 0  # rjenkins1
        item osd.24 weight 1.820
        item osd.25 weight 1.820
        item osd.26 weight 1.820
        item osd.27 weight 1.820
        item osd.28 weight 1.820
        item osd.29 weight 1.820
        item osd.30 weight 1.820
        item osd.31 weight 1.820
        item osd.32 weight 1.820
        item osd.33 weight 1.820
        item osd.34 weight 1.820
        item osd.35 weight 1.820
}
rack Rack02 {
        id -9   # do not change unnecessarily
        # weight 43.680
        alg straw
        hash 0  # rjenkins1
        item Ceph03 weight 21.840
        item Ceph04 weight 21.840
}
host Ceph05 {
        id -5   # do not change unnecessarily
        # weight 21.840
        alg straw
        hash 0  # rjenkins1
        item osd.36 weight 1.820
        item osd.37 weight 1.820
        item osd.38 weight 1.820
        item osd.39 weight 1.820
        item osd.40 weight 1.820
        item osd.41 weight 1.820
        item osd.42 weight 1.820
        item osd.43 weight 1.820
        item osd.44 weight 1.820
        item osd.45 weight 1.820
        item osd.46 weight 1.820
        item osd.47 weight 1.820
}
host Ceph06 {
        id -6   # do not change unnecessarily
        # weight 21.840
        alg straw
        hash 0  # rjenkins1
        item osd.48 weight 1.820
        item osd.49 weight 1.820
        item osd.50 weight 1.820
        item osd.51 weight 1.820
        item osd.52 weight 1.820
        item osd.53 weight 1.820
        item osd.54 weight 1.820
        item osd.55 weight 1.820
        item osd.56 weight 1.820
        item osd.57 weight 1.820
        item osd.58 weight 1.820
        item osd.59 weight 1.820
}
rack Rack03 {
        id -10  # do not change unnecessarily
        # weight 43.680
        alg straw
        hash 0  # rjenkins1
        item Ceph05 weight 21.840
        item Ceph06 weight 21.840
}
datacenter MUC1 {
        id -11  # do not change unnecessarily
        # weight 131.040
        alg straw
        hash 0  # rjenkins1
        item Rack01 weight 43.680
        item Rack02 weight 43.680
        item Rack03 weight 43.680
}
root default {
        id -1   # do not change unnecessarily
        # weight 152.880
        alg straw
        hash 0  # rjenkins1
        item MUC1 weight 131.040
        item Ceph03 weight 21.840
}

# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule custom_ruleset {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 1 type datacenter
        step choose firstn 3 type rack
        step chooseleaf firstn 1 type host
        step emit
}

# end crush map
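As an aside, I'm wondering whether it would make sense to sanity-check the compiled map with crushtool's test mode, given the empty up/acting sets I mentioned above. Something like this, I think (untested on my side; the rule number and replica count are just examples, and the .bin path is the same placeholder as before):

# Sketch: ask crushtool which OSDs the map would choose for a handful of sample inputs.
crushtool -i /tmp/crushmap.bin --test --show-mappings --rule 0 --num-rep 3 --min-x 0 --max-x 9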
CTR01:~# cat tree.txt
ID  WEIGHT    TYPE NAME                  UP/DOWN REWEIGHT PRIMARY-AFFINITY
 -1 152.87979 root default
-11 131.03989     datacenter MUC1
 -8  43.67999         rack MUC1-Rack01
 -2  21.84000             host MUC1-Ceph01
  0   1.81999                 osd.0      up 1.00000 1.00000
  1   1.81999                 osd.1      up 1.00000 1.00000
  2   1.81999                 osd.2      up 1.00000 1.00000
  3   1.81999                 osd.3      up 1.00000 1.00000
  4   1.81999                 osd.4      up 1.00000 1.00000
  5   1.81999                 osd.5      up 1.00000 1.00000
  6   1.81999                 osd.6      up 1.00000 1.00000
  7   1.81999                 osd.7      up 1.00000 1.00000
  8   1.81999                 osd.8      up 1.00000 1.00000
  9   1.81999                 osd.9      up 1.00000 1.00000
 10   1.81999                 osd.10     up 1.00000 1.00000
 11   1.81999                 osd.11     up 1.00000 1.00000
 -3  21.84000             host MUC1-Ceph02
 12   1.81999                 osd.12     up 1.00000 1.00000
 13   1.81999                 osd.13     up 1.00000 1.00000
 14   1.81999                 osd.14     up 1.00000 1.00000
 15   1.81999                 osd.15     up 1.00000 1.00000
 16   1.81999                 osd.16     up 1.00000 1.00000
 17   1.81999                 osd.17     up 1.00000 1.00000
 18   1.81999                 osd.18     up 1.00000 1.00000
 19   1.81999                 osd.19     up 1.00000 1.00000
 20   1.81999                 osd.20     up 1.00000 1.00000
 21   1.81999                 osd.21     up 1.00000 1.00000
 22   1.81999                 osd.22     up 1.00000 1.00000
 23   1.81999                 osd.23     up 1.00000 1.00000
 -9  43.67990         rack MUC1-Rack02
 -7  21.83990             host MUC1-Ceph03
 60   1.81999                 osd.60     up 1.00000 1.00000
 61   1.81999                 osd.61     up 1.00000 1.00000
 62   1.81999                 osd.62     up 1.00000 1.00000
 63   1.81999                 osd.63     up 1.00000 1.00000
 64   1.81999                 osd.64     up 1.00000 1.00000
 65   1.81999                 osd.65     up 1.00000 1.00000
 66   1.81999                 osd.66     up 1.00000 1.00000
 67   1.81999                 osd.67     up 1.00000 1.00000
 68   1.81999                 osd.68     up 1.00000 1.00000
 69   1.81999                 osd.69     up 1.00000 1.00000
 70   1.81999                 osd.70     up 1.00000 1.00000
 72   1.81999                 osd.72     up 1.00000 1.00000
 -4  21.84000             host MUC1-Ceph04
 24   1.81999                 osd.24     up 1.00000 1.00000
 25   1.81999                 osd.25     up 1.00000 1.00000
 26   1.81999                 osd.26     up 1.00000 1.00000
 27   1.81999                 osd.27     up 1.00000 1.00000
 28   1.81999                 osd.28     up 1.00000 1.00000
 29   1.81999                 osd.29     up 1.00000 1.00000
 30   1.81999                 osd.30     up 1.00000 1.00000
 31   1.81999                 osd.31     up 1.00000 1.00000
 32   1.81999                 osd.32     up 1.00000 1.00000
 33   1.81999                 osd.33     up 1.00000 1.00000
 34   1.81999                 osd.34     up 1.00000 1.00000
 35   1.81999                 osd.35     up 1.00000 1.00000
-10  43.67999         rack MUC1-Rack03
 -5  21.84000             host MUC1-Ceph05
 36   1.81999                 osd.36     up 1.00000 1.00000
 37   1.81999                 osd.37     up 1.00000 1.00000
 38   1.81999                 osd.38     up 1.00000 1.00000
 39   1.81999                 osd.39     up 1.00000 1.00000
 40   1.81999                 osd.40     up 1.00000 1.00000
 41   1.81999                 osd.41     up 1.00000 1.00000
 42   1.81999                 osd.42     up 1.00000 1.00000
 43   1.81999                 osd.43     up 1.00000 1.00000
 44   1.81999                 osd.44     up 1.00000 1.00000
 45   1.81999                 osd.45     up 1.00000 1.00000
 46   1.81999                 osd.46     up 1.00000 1.00000
 47   1.81999                 osd.47     up 1.00000 1.00000
 -6  21.84000             host MUC1-Ceph06
 48   1.81999                 osd.48     up 1.00000 1.00000
 49   1.81999                 osd.49     up 1.00000 1.00000
 50   1.81999                 osd.50     up 1.00000 1.00000
 51   1.81999                 osd.51     up 1.00000 1.00000
 52   1.81999                 osd.52     up 1.00000 1.00000
 53   1.81999                 osd.53     up 1.00000 1.00000
 54   1.81999                 osd.54     up 1.00000 1.00000
 55   1.81999                 osd.55     up 1.00000 1.00000
 56   1.81999                 osd.56     up 1.00000 1.00000
 57   1.81999                 osd.57     up 1.00000 1.00000
 58   1.81999                 osd.58     up 1.00000 1.00000
 59   1.81999                 osd.59     up 1.00000 1.00000
 -7  21.83990     host MUC1-Ceph03
 60   1.81999         osd.60     up 1.00000 1.00000
 61   1.81999         osd.61     up 1.00000 1.00000
 62   1.81999         osd.62     up 1.00000 1.00000
 63   1.81999         osd.63     up 1.00000 1.00000
 64   1.81999         osd.64     up 1.00000 1.00000
 65   1.81999         osd.65     up 1.00000 1.00000
 66   1.81999         osd.66     up 1.00000 1.00000
 67   1.81999         osd.67     up 1.00000 1.00000
 68   1.81999         osd.68     up 1.00000 1.00000
 69   1.81999         osd.69     up 1.00000 1.00000
 70   1.81999         osd.70     up 1.00000 1.00000
 72   1.81999         osd.72     up 1.00000 1.00000

From: Goncalo Borges

Hi Joel.

The pgs of a given pool start with the id of the pool, so the 19.xx pgs are from pool 19. I think a 'ceph osd dump' should give you a summary of all pools and their ids at the very beginning of the output. My guess is that this will confirm that your volumes or images pool has ID 19.

The next step is to do a 'ceph pg 19.xx query' and see what it spits out. That can give you some info on what is going on with the pg.

I would first focus on the creating pgs. Try to understand which OSDs are involved for one of those pgs (the query should probably tell you that). Check their logs. Sometimes it is sufficient just to restart one of those OSDs for the pg to be created.

Also check http://docs.ceph.com/docs/jewel/rados/troubleshooting/troubleshooting-pg/

Cheers
Goncalo

________________________________________
From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Joel Griffiths [joel.griffiths@xxxxxxxxxxxxxxxx]
Sent: 12 November 2016 12:38
To: ceph-users@xxxxxxxxxxxxxx
Subject: stuck unclean since forever

I've been struggling with a broken Ceph node, and I have very limited Ceph knowledge. With only 3-4 days of actually using it, I was tasked with upgrading it. Everything seemed to go fine at first, but it didn't last: the next day I was informed that people were unable to create volumes (we successfully created a volume immediately after the upgrade, but could no longer do so).

After some investigation, I discovered that 'rados -p volumes ls' just hangs. Another pool (images) behaves the same way; the rest don't seem to have any issues. We are running 6 Ceph servers with 72 OSDs.
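(For anyone who wants to reproduce the check, this is roughly what I did by hand, written as a loop; the loop itself and the 10-second timeout are just a sketch:)

# Sketch: try a simple object listing against each pool and flag the ones that hang.
for pool in $(rados lspools); do
    printf '%s: ' "$pool"
    timeout 10 rados -p "$pool" ls > /dev/null 2>&1 && echo responsive || echo 'hangs or errors'
done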
Here is what ceph status brings up (now):

root@CTR01:~# ceph -s
    cluster c14740db-4771-4f95-8268-689bba5598eb
     health HEALTH_WARN
            1538 pgs stale
            282 pgs stuck inactive
            1538 pgs stuck stale
            282 pgs stuck unclean
            too many PGs per OSD (747 > max 300)
     monmap e1: 3 mons at {Ceph02=192.168.0.12:6789/0,ceph04=192.168.90.14:6789/0,Ceph06=192.168.0.16:6789/0}
            election epoch 3066, quorum 0,1,2 Ceph02,Ceph04,Ceph06
     osdmap e1325: 72 osds: 72 up, 72 in
      pgmap v2515322: 18232 pgs, 19 pools, 1042 GB data, 437 kobjects
            3143 GB used, 127 TB / 130 TB avail
               16412 active+clean
                1538 stale+active+clean
                 282 creating

Some notes:

1538 stale+active+clean - Most of these (1250 or 1350 or so) were left over from the initial installation. They weren't actually being used by the system. I inherited the system with those and was told nobody knew how to get rid of them; apparently they were part of a Ceph false start.

282 creating - While I was looking at the issue, I noticed a 'ceph -s' warning about another pool (one we use for Swift). It complained about too few PGs per OSD, so I increased pg_num and pgp_num from 1024 to 2048; I was hoping the two problems were related. That's what added the 'creating' status line, I think (they are also all in 19.xx - is that osd.19?).
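(For the record, the resize was done with commands along these lines; 'swift-pool' here is just a placeholder, not the actual pool name:)

# Sketch of the pg_num/pgp_num bump; substitute the real pool name for 'swift-pool'.
ceph osd pool set swift-pool pg_num 2048
ceph osd pool set swift-pool pgp_num 2048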
root@CTR01:~# ceph health detail | grep unclean
HEALTH_WARN 1538 pgs stale; 282 pgs stuck inactive; 1538 pgs stuck stale; 282 pgs stuck unclean; too many PGs per OSD (747 > max 300)
pg 19.5b1 is stuck unclean since forever, current state creating, last acting []
pg 19.c5 is stuck unclean since forever, current state creating, last acting []
pg 19.c6 is stuck unclean since forever, current state creating, last acting []
pg 19.c0 is stuck unclean since forever, current state creating, last acting []
pg 19.c2 is stuck unclean since forever, current state creating, last acting []
pg 19.726 is stuck unclean since forever, current state creating, last acting []
pg 19.727 is stuck unclean since forever, current state creating, last acting []
pg 19.412 is stuck unclean since forever, current state creating, last acting []
. . .
pg 19.26c is stuck unclean since forever, current state creating, last acting []
pg 19.5be is stuck unclean since forever, current state creating, last acting []
pg 19.264 is stuck unclean since forever, current state creating, last acting []
pg 19.5b4 is stuck unclean since forever, current state creating, last acting []
pg 19.260 is stuck unclean since forever, current state creating, last acting []

Looking at the osd.19 log, I get the same messages I see in osd.20.log:

root@Ceph02:~# tail -10 /var/log/ceph/ceph-osd.19.log
2016-11-12 02:18:36.047803 7f973fe58700 0 -- 192.168.92.12:6818/4289 >> 192.168.92.195:6814/4099 pipe(0xc057000 sd=87 :57536 s=2 pgs=1039 cs=21 l=0 c=0xa3a34a0).fault with nothing to send, going to standby
2016-11-12 02:22:49.242045 7f974045e700 0 -- 192.168.92.12:6818/4289 >> 192.168.92.193:6812/4067 pipe(0xa402000 sd=25 :48529 s=2 pgs=986 cs=21 l=0 c=0xa3a5b20).fault with nothing to send, going to standby
2016-11-12 02:22:49.244093 7f973e741700 0 -- 192.168.92.12:6818/4289 >> 192.168.92.196:6810/4118 pipe(0xba4e000 sd=51 :50137 s=2 pgs=933 cs=35 l=0 c=0xb7af760).fault with nothing to send, going to standby
2016-11-12 02:25:20.699763 7f97383e5700 0 -- 192.168.92.12:6818/4289 >> 192.168.92.194:6806/4108 pipe(0xba76000 sd=134 :6818 s=2 pgs=972 cs=21 l=0 c=0xb7afb80).fault with nothing to send, going to standby
2016-11-12 02:28:02.526393 7f9720669700 0 -- 192.168.92.12:6818/4289 >> 192.168.92.193:6806/3964 pipe(0xbb54000 sd=210 :6818 s=0 pgs=0 cs=0 l=0 c=0xc5bc840).accept connect_seq 41 vs existing 41 state standby
2016-11-12 02:28:02.526750 7f9720669700 0 -- 192.168.92.12:6818/4289 >> 192.168.92.193:6806/3964 pipe(0xbb54000 sd=210 :6818 s=0 pgs=0 cs=0 l=0 c=0xc5bc840).accept connect_seq 42 vs existing 41 state standby
2016-11-12 02:33:40.838728 7f973d933700 0 -- 192.168.92.12:6818/4289 >> 192.168.92.193:6822/4147 pipe(0xbbae000 sd=92 :6818 s=0 pgs=0 cs=0 l=0 c=0x5a939c0).accept connect_seq 27 vs existing 27 state standby
2016-11-12 02:33:40.839052 7f973d933700 0 -- 192.168.92.12:6818/4289 >> 192.168.92.193:6822/4147 pipe(0xbbae000 sd=92 :6818 s=0 pgs=0 cs=0 l=0 c=0x5a939c0).accept connect_seq 28 vs existing 27 state standby
2016-11-12 02:34:00.187408 7f9719706700 0 -- 192.168.92.12:6818/4289 >> 192.168.92.193:6818/4140 pipe(0xc052000 sd=65 :6818 s=0 pgs=0 cs=0 l=0 c=0x5a91760).accept connect_seq 31 vs existing 31 state standby
2016-11-12 02:34:00.187686 7f9719706700 0 -- 192.168.92.12:6818/4289 >> 192.168.92.193:6818/4140 pipe(0xc052000 sd=65 :6818 s=0 pgs=0 cs=0 l=0 c=0x5a91760).accept connect_seq 32 vs existing 31 state standby

At this point I'm stuck. I have no idea what to do to fix the 'volumes' pool. Does anybody have any suggestions?

--
Joel

JOEL GRIFFITHS
LINUX SYSTEMS ENGINEER |
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com