On Tue, 23 May 2017, Łukasz Chrustek wrote:
> I haven't slept for over 30 hours and still can't find a solution. I
> did as you wrote, but turning off these OSDs
> (https://pastebin.com/1npBXeMV) didn't resolve the issue...

The important bit is:

            "blocked": "peering is blocked due to down osds",
            "down_osds_we_would_probe": [
                6,
                10,
                33,
                37,
                72
            ],
            "peering_blocked_by": [
                {
                    "osd": 6,
                    "current_lost_at": 0,
                    "comment": "starting or marking this osd lost may let us proceed"
                },
                {
                    "osd": 10,
                    "current_lost_at": 0,
                    "comment": "starting or marking this osd lost may let us proceed"
                },
                {
                    "osd": 37,
                    "current_lost_at": 0,
                    "comment": "starting or marking this osd lost may let us proceed"
                },
                {
                    "osd": 72,
                    "current_lost_at": 113771,
                    "comment": "starting or marking this osd lost may let us proceed"
                }
            ]
        },

Are any of those OSDs startable?

sage

>
> Regards
> Lukasz Chrustek
>
>
> > On Tue, 23 May 2017, Łukasz Chrustek wrote:
> >> Hello,
> >>
> >> After a terrible outage caused by the failure of a 10Gbit switch, the ceph
> >> cluster went to HEALTH_ERR (three whole storage servers went offline at the
> >> same time and didn't come back quickly). After cluster recovery two PGs
> >> went to the incomplete state; I can't query them, and can't do anything with
> >> them,
>
> > The thing where you can't query a PG is because the OSD is throttling
> > incoming work and the throttle is exhausted (the PG can't do work so it
> > isn't making progress). A workaround for jewel is to restart the OSD
> > serving the PG and do the query quickly after that (probably in a loop so
> > that you catch it after it starts up but before the throttle is
> > exhausted again). (In luminous this is fixed.)
>
> > Once you have the query output ('ceph tell $pgid query') you'll be able to
> > tell what is preventing the PG from peering.
>
> > You can identify the osd(s) hosting the pg with 'ceph pg map $pgid'.
>
> > HTH!
> > sage
>
>
> >> that would get the cluster working again.
> >> Here is an strace of this command: https://pastebin.com/HpNFvR8Z.
> >> But... this cluster isn't entirely off:
> >>
> >> [root@cc1 ~]# rbd ls management-vms
> >> os-mongodb1
> >> os-mongodb1-database
> >> os-gitlab-root
> >> os-mongodb1-database2
> >> os-wiki-root
> >> [root@cc1 ~]# rbd ls volumes
> >> ^C
> >> [root@cc1 ~]#
> >>
> >> and the same for all mon hosts (I won't paste all three here)
> >>
> >> [root@cc1 ~]# rbd -m 192.168.128.1 list management-vms
> >> os-mongodb1
> >> os-mongodb1-database
> >> os-gitlab-root
> >> os-mongodb1-database2
> >> os-wiki-root
> >> [root@cc1 ~]# rbd -m 192.168.128.1 list volumes
> >> ^C
> >> [root@cc1 ~]#
> >>
> >> and for all other pools in the list, except (most important) volumes, I can
> >> list images.
> >>
> >> Funny thing, I can get rbd info for a particular image:
> >>
> >> [root@cc1 ~]# rbd info volumes/volume-197602d7-40f9-40ad-b286-cdec688b1497
> >> rbd image 'volume-197602d7-40f9-40ad-b286-cdec688b1497':
> >>         size 20480 MB in 1280 objects
> >>         order 24 (16384 kB objects)
> >>         block_name_prefix: rbd_data.64a21a0a9acf52
> >>         format: 2
> >>         features: layering
> >>         flags:
> >>         parent: images/37bdf0ca-f1f3-46ce-95b9-c04bb9ac8a53@snap
> >>         overlap: 3072 MB
> >>
> >> but can't list the whole content of pool volumes.
> >>
> >> [root@cc1 ~]# ceph osd pool ls
> >> volumes
> >> images
> >> backups
> >> volumes-ssd-intel-s3700
> >> management-vms
> >> .rgw.root
> >> .rgw.control
> >> .rgw
> >> .rgw.gc
> >> .log
> >> .users.uid
> >> .rgw.buckets.index
> >> .users
> >> .rgw.buckets.extra
> >> .rgw.buckets
> >> volumes-cached
> >> cache-ssd
> >>
> >> here is ceph osd tree:
> >>
> >> ID  WEIGHT    TYPE NAME               UP/DOWN REWEIGHT PRIMARY-AFFINITY
> >>  -7  20.88388 root ssd-intel-s3700
> >> -11   3.19995     host ssd-stor1
> >>  56   0.79999         osd.56               up  1.00000          1.00000
> >>  57   0.79999         osd.57               up  1.00000          1.00000
> >>  58   0.79999         osd.58               up  1.00000          1.00000
> >>  59   0.79999         osd.59               up  1.00000          1.00000
> >>  -9   2.12999     host ssd-stor2
> >>  60   0.70999         osd.60               up  1.00000          1.00000
> >>  61   0.70999         osd.61               up  1.00000          1.00000
> >>  62   0.70999         osd.62               up  1.00000          1.00000
> >>  -8   2.12999     host ssd-stor3
> >>  63   0.70999         osd.63               up  1.00000          1.00000
> >>  64   0.70999         osd.64               up  1.00000          1.00000
> >>  65   0.70999         osd.65               up  1.00000          1.00000
> >> -10   4.19998     host ssd-stor4
> >>  25   0.70000         osd.25               up  1.00000          1.00000
> >>  26   0.70000         osd.26               up  1.00000          1.00000
> >>  27   0.70000         osd.27               up  1.00000          1.00000
> >>  28   0.70000         osd.28               up  1.00000          1.00000
> >>  29   0.70000         osd.29               up  1.00000          1.00000
> >>  24   0.70000         osd.24               up  1.00000          1.00000
> >> -12   3.41199     host ssd-stor5
> >>  73   0.85300         osd.73               up  1.00000          1.00000
> >>  74   0.85300         osd.74               up  1.00000          1.00000
> >>  75   0.85300         osd.75               up  1.00000          1.00000
> >>  76   0.85300         osd.76               up  1.00000          1.00000
> >> -13   3.41199     host ssd-stor6
> >>  77   0.85300         osd.77               up  1.00000          1.00000
> >>  78   0.85300         osd.78               up  1.00000          1.00000
> >>  79   0.85300         osd.79               up  1.00000          1.00000
> >>  80   0.85300         osd.80               up  1.00000          1.00000
> >> -15   2.39999     host ssd-stor7
> >>  90   0.79999         osd.90               up  1.00000          1.00000
> >>  91   0.79999         osd.91               up  1.00000          1.00000
> >>  92   0.79999         osd.92               up  1.00000          1.00000
> >>  -1 167.69969 root default
> >>  -2  33.99994     host stor1
> >>   6   3.39999         osd.6              down        0          1.00000
> >>   7   3.39999         osd.7                up  1.00000          1.00000
> >>   8   3.39999         osd.8                up  1.00000          1.00000
> >>   9   3.39999         osd.9                up  1.00000          1.00000
> >>  10   3.39999         osd.10             down        0          1.00000
> >>  11   3.39999         osd.11             down        0          1.00000
> >>  69   3.39999         osd.69               up  1.00000          1.00000
> >>  70   3.39999         osd.70               up  1.00000          1.00000
> >>  71   3.39999         osd.71             down        0          1.00000
> >>  81   3.39999         osd.81               up  1.00000          1.00000
> >>  -3  20.99991     host stor2
> >>  13   2.09999         osd.13               up  1.00000          1.00000
> >>  12   2.09999         osd.12               up  1.00000          1.00000
> >>  14   2.09999         osd.14               up  1.00000          1.00000
> >>  15   2.09999         osd.15               up  1.00000          1.00000
> >>  16   2.09999         osd.16               up  1.00000          1.00000
> >>  17   2.09999         osd.17               up  1.00000          1.00000
> >>  18   2.09999         osd.18             down        0          1.00000
> >>  19   2.09999         osd.19               up  1.00000          1.00000
> >>  20   2.09999         osd.20               up  1.00000          1.00000
> >>  21   2.09999         osd.21               up  1.00000          1.00000
> >>  -4  25.00000     host stor3
> >>  30   2.50000         osd.30               up  1.00000          1.00000
> >>  31   2.50000         osd.31               up  1.00000          1.00000
> >>  32   2.50000         osd.32               up  1.00000          1.00000
> >>  33   2.50000         osd.33             down        0          1.00000
> >>  34   2.50000         osd.34               up  1.00000          1.00000
> >>  35   2.50000         osd.35               up  1.00000          1.00000
> >>  66   2.50000         osd.66               up  1.00000          1.00000
> >>  67   2.50000         osd.67               up  1.00000          1.00000
> >>  68   2.50000         osd.68               up  1.00000          1.00000
> >>  72   2.50000         osd.72             down        0          1.00000
> >>  -5  25.00000     host stor4
> >>  44   2.50000         osd.44               up  1.00000          1.00000
> >>  45   2.50000         osd.45               up  1.00000          1.00000
> >>  46   2.50000         osd.46             down        0          1.00000
> >>  47   2.50000         osd.47               up  1.00000          1.00000
> >>   0   2.50000         osd.0                up  1.00000          1.00000
> >>   1   2.50000         osd.1                up  1.00000          1.00000
> >>   2   2.50000         osd.2                up  1.00000          1.00000
> >>   3   2.50000         osd.3                up  1.00000          1.00000
> >>   4   2.50000         osd.4                up  1.00000          1.00000
> >>   5   2.50000         osd.5                up  1.00000          1.00000
> >>  -6  14.19991     host stor5
> >>  48   1.79999         osd.48               up  1.00000          1.00000
> >>  49   1.59999         osd.49               up  1.00000          1.00000
> >>  50   1.79999         osd.50               up  1.00000          1.00000
> >>  51   1.79999         osd.51             down        0          1.00000
> >>  52   1.79999         osd.52               up  1.00000          1.00000
> >>  53   1.79999         osd.53               up  1.00000          1.00000
> >>  54   1.79999         osd.54               up  1.00000          1.00000
> >>  55   1.79999         osd.55               up  1.00000          1.00000
> >> -14  14.39999     host stor6
> >>  82   1.79999         osd.82               up  1.00000          1.00000
> >>  83   1.79999         osd.83               up  1.00000          1.00000
> >>  84   1.79999         osd.84               up  1.00000          1.00000
> >>  85   1.79999         osd.85               up  1.00000          1.00000
> >>  86   1.79999         osd.86               up  1.00000          1.00000
> >>  87   1.79999         osd.87               up  1.00000          1.00000
> >>  88   1.79999         osd.88               up  1.00000          1.00000
> >>  89   1.79999         osd.89               up  1.00000          1.00000
> >> -16  12.59999     host stor7
> >>  93   1.79999         osd.93               up  1.00000          1.00000
> >>  94   1.79999         osd.94               up  1.00000          1.00000
> >>  95   1.79999         osd.95               up  1.00000          1.00000
> >>  96   1.79999         osd.96               up  1.00000          1.00000
> >>  97   1.79999         osd.97               up  1.00000          1.00000
> >>  98   1.79999         osd.98               up  1.00000          1.00000
> >>  99   1.79999         osd.99               up  1.00000          1.00000
> >> -17  21.49995     host stor8
> >>  22   1.59999         osd.22               up  1.00000          1.00000
> >>  23   1.59999         osd.23               up  1.00000          1.00000
> >>  36   2.09999         osd.36               up  1.00000          1.00000
> >>  37   2.09999         osd.37               up  1.00000          1.00000
> >>  38   2.50000         osd.38               up  1.00000          1.00000
> >>  39   2.50000         osd.39               up  1.00000          1.00000
> >>  40   2.50000         osd.40               up  1.00000          1.00000
> >>  41   2.50000         osd.41             down        0          1.00000
> >>  42   2.50000         osd.42               up  1.00000          1.00000
> >>  43   1.59999         osd.43               up  1.00000          1.00000
> >> [root@cc1 ~]#
> >>
> >> and ceph health detail:
> >>
> >> ceph health detail | grep down
> >> HEALTH_WARN 23 pgs backfilling; 23 pgs degraded; 2 pgs down; 2 pgs
> >> peering; 2 pgs stuck inactive; 25 pgs stuck unclean; 23 pgs
> >> undersized; recovery 176211/14148564 objects degraded (1.245%);
> >> recovery 238972/14148564 objects misplaced (1.689%); noout flag(s) set
> >> pg 1.60 is stuck inactive since forever, current state down+remapped+peering, last acting [66,69,40]
> >> pg 1.165 is stuck inactive since forever, current state down+remapped+peering, last acting [37]
> >> pg 1.60 is stuck unclean since forever, current state down+remapped+peering, last acting [66,69,40]
> >> pg 1.165 is stuck unclean since forever, current state down+remapped+peering, last acting [37]
> >> pg 1.165 is down+remapped+peering, acting [37]
> >> pg 1.60 is down+remapped+peering, acting [66,69,40]
> >>
> >>
> >> The problematic pgs are 1.165 and 1.60.
> >>
> >> Please advise how to unblock the volumes pool and/or get these two pgs
> >> working. After the last night and day of trying to solve this issue,
> >> these pgs are 100% empty of data.
> >>
> >>
> >> --
> >> Regards,
> >> Łukasz Chrustek
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>
> >>
> >
>
> --
> Regards,
> Łukasz Chrustek
>
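
A footnote on reading the query output quoted at the top of this thread: once the output of 'ceph tell $pgid query' has been saved, the OSDs that block peering can be listed with a few lines of Python. This is only a sketch under stated assumptions. The name find_peering_blockers and the SAMPLE text are made up for illustration (SAMPLE is the fragment from this thread wrapped in braces so that it parses as JSON), and treating a current_lost_at of 0 as "not marked lost" is an assumption, not something confirmed in this thread.

```python
import json


def find_peering_blockers(node):
    """Recursively collect every entry stored under a "peering_blocked_by" key.

    Walking the whole structure avoids assuming where in the query
    output the peering information is nested.
    """
    blockers = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "peering_blocked_by" and isinstance(value, list):
                blockers.extend(value)
            else:
                blockers.extend(find_peering_blockers(value))
    elif isinstance(node, list):
        for item in node:
            blockers.extend(find_peering_blockers(item))
    return blockers


# Fragment quoted in this thread, wrapped in braces so json.loads accepts it.
SAMPLE = """
{
  "blocked": "peering is blocked due to down osds",
  "down_osds_we_would_probe": [6, 10, 33, 37, 72],
  "peering_blocked_by": [
    {"osd": 6, "current_lost_at": 0,
     "comment": "starting or marking this osd lost may let us proceed"},
    {"osd": 10, "current_lost_at": 0,
     "comment": "starting or marking this osd lost may let us proceed"},
    {"osd": 37, "current_lost_at": 0,
     "comment": "starting or marking this osd lost may let us proceed"},
    {"osd": 72, "current_lost_at": 113771,
     "comment": "starting or marking this osd lost may let us proceed"}
  ]
}
"""

blockers = find_peering_blockers(json.loads(SAMPLE))
blocking_ids = [b["osd"] for b in blockers]
# Assumption: current_lost_at == 0 means the OSD has not been marked lost.
not_marked_lost = [b["osd"] for b in blockers if b["current_lost_at"] == 0]

print("OSDs blocking peering:", blocking_ids)
print("of those, not marked lost:", not_marked_lost)
```

For real use, SAMPLE would be replaced by the contents of the saved query output. The result is the list of OSDs to check against 'ceph osd tree' when deciding which ones might be startable.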