Hi,

> On Tue, 23 May 2017, Łukasz Chrustek wrote:
>> Hi,
>>
>> Hello,
>>
>> After a terrible outage caused by the failure of a 10Gbit switch, the ceph
>> cluster went to HEALTH_ERR (three whole storage servers went offline at the
>> same time and didn't come back quickly). After the cluster recovered, two
>> PGs went into an incomplete state; I can't query them, and can't do
>> anything with them

> The thing where you can't query a PG is because the OSD is throttling
> incoming work and the throttle is exhausted (the PG can't do work so it
> isn't making progress). A workaround for jewel is to restart the OSD
> serving the PG and do the query quickly after that (probably in a loop so
> that you catch it after it starts up but before the throttle is
> exhausted again). (In luminous this is fixed.)

Thank you for the clarification.

> Once you have the query output ('ceph tell $pgid query') you'll be able to
> tell what is preventing the PG from peering.

Hm... what kind of loop do you suggest? When I run ceph tell $pgid query it
hangs and never returns to the console. (I put a rough guess at such a loop
in a PS at the end of this mail - please correct me if that is not what you
meant.)

> You can identify the osd(s) hosting the pg with 'ceph pg map $pgid'.

Something strange is going on with 1.165: how is it possible that the acting
set is [37] when 37 isn't in the up set [84,38,48]?

ceph pg map 1.165
osdmap e114855 pg 1.165 (1.165) -> up [84,38,48] acting [37]

The second one looks OK, but I can't run a pg query on it either:

[root@cc1 ~]# ceph pg map 1.60
osdmap e114855 pg 1.60 (1.60) -> up [66,84,40] acting [66,69,40]

Do I need to restart all three OSDs at the same time?

Can you advise how to unblock access to one of the pools for this kind of
command:

[root@cc1 ~]# rbd ls volumes
^C

An strace for this is here: https://pastebin.com/hpbDg6gP - this time it
hangs on a futex call.

Are these two cases (the pg query hang and the rbd ls problem) connected to
each other? (I put a guess about this in a second PS at the end of this
mail.)

If I find a solution for this, you will make my day (and night :) ).

Regards
Lukasz

> HTH!
> sage

>> that would bring the cluster back to a working state. Here is an strace of
>> this command: https://pastebin.com/HpNFvR8Z. But... this cluster isn't
>> entirely off:
>>
>> [root@cc1 ~]# rbd ls management-vms
>> os-mongodb1
>> os-mongodb1-database
>> os-gitlab-root
>> os-mongodb1-database2
>> os-wiki-root
>> [root@cc1 ~]# rbd ls volumes
>> ^C
>> [root@cc1 ~]#
>>
>> and the same against each mon host (I'm not pasting all three here):
>>
>> [root@cc1 ~]# rbd -m 192.168.128.1 list management-vms
>> os-mongodb1
>> os-mongodb1-database
>> os-gitlab-root
>> os-mongodb1-database2
>> os-wiki-root
>> [root@cc1 ~]# rbd -m 192.168.128.1 list volumes
>> ^C
>> [root@cc1 ~]#
>>
>> For all other pools from the list, except (the most important one)
>> volumes, I can list images.
>>
>> Funny thing: I can list rbd info for a particular image:
>>
>> [root@cc1 ~]# rbd info volumes/volume-197602d7-40f9-40ad-b286-cdec688b1497
>> rbd image 'volume-197602d7-40f9-40ad-b286-cdec688b1497':
>>         size 20480 MB in 1280 objects
>>         order 24 (16384 kB objects)
>>         block_name_prefix: rbd_data.64a21a0a9acf52
>>         format: 2
>>         features: layering
>>         flags:
>>         parent: images/37bdf0ca-f1f3-46ce-95b9-c04bb9ac8a53@snap
>>         overlap: 3072 MB
>>
>> but I can't list the whole content of the volumes pool.
>>
>> [root@cc1 ~]# ceph osd pool ls
>> volumes
>> images
>> backups
>> volumes-ssd-intel-s3700
>> management-vms
>> .rgw.root
>> .rgw.control
>> .rgw
>> .rgw.gc
>> .log
>> .users.uid
>> .rgw.buckets.index
>> .users
>> .rgw.buckets.extra
>> .rgw.buckets
>> volumes-cached
>> cache-ssd
>>
>> Here is the ceph osd tree:
>>
>> ID   WEIGHT     TYPE NAME              UP/DOWN  REWEIGHT  PRIMARY-AFFINITY
>> -7   20.88388   root ssd-intel-s3700
>> -11  3.19995      host ssd-stor1
>> 56   0.79999        osd.56            up       1.00000   1.00000
>> 57   0.79999        osd.57            up       1.00000   1.00000
>> 58   0.79999        osd.58            up       1.00000   1.00000
>> 59   0.79999        osd.59            up       1.00000   1.00000
>> -9   2.12999      host ssd-stor2
>> 60   0.70999        osd.60            up       1.00000   1.00000
>> 61   0.70999        osd.61            up       1.00000   1.00000
>> 62   0.70999        osd.62            up       1.00000   1.00000
>> -8   2.12999      host ssd-stor3
>> 63   0.70999        osd.63            up       1.00000   1.00000
>> 64   0.70999        osd.64            up       1.00000   1.00000
>> 65   0.70999        osd.65            up       1.00000   1.00000
>> -10  4.19998      host ssd-stor4
>> 25   0.70000        osd.25            up       1.00000   1.00000
>> 26   0.70000        osd.26            up       1.00000   1.00000
>> 27   0.70000        osd.27            up       1.00000   1.00000
>> 28   0.70000        osd.28            up       1.00000   1.00000
>> 29   0.70000        osd.29            up       1.00000   1.00000
>> 24   0.70000        osd.24            up       1.00000   1.00000
>> -12  3.41199      host ssd-stor5
>> 73   0.85300        osd.73            up       1.00000   1.00000
>> 74   0.85300        osd.74            up       1.00000   1.00000
>> 75   0.85300        osd.75            up       1.00000   1.00000
>> 76   0.85300        osd.76            up       1.00000   1.00000
>> -13  3.41199      host ssd-stor6
>> 77   0.85300        osd.77            up       1.00000   1.00000
>> 78   0.85300        osd.78            up       1.00000   1.00000
>> 79   0.85300        osd.79            up       1.00000   1.00000
>> 80   0.85300        osd.80            up       1.00000   1.00000
>> -15  2.39999      host ssd-stor7
>> 90   0.79999        osd.90            up       1.00000   1.00000
>> 91   0.79999        osd.91            up       1.00000   1.00000
>> 92   0.79999        osd.92            up       1.00000   1.00000
>> -1   167.69969  root default
>> -2   33.99994     host stor1
>> 6    3.39999        osd.6             down     0         1.00000
>> 7    3.39999        osd.7             up       1.00000   1.00000
>> 8    3.39999        osd.8             up       1.00000   1.00000
>> 9    3.39999        osd.9             up       1.00000   1.00000
>> 10   3.39999        osd.10            down     0         1.00000
>> 11   3.39999        osd.11            down     0         1.00000
>> 69   3.39999        osd.69            up       1.00000   1.00000
>> 70   3.39999        osd.70            up       1.00000   1.00000
>> 71   3.39999        osd.71            down     0         1.00000
>> 81   3.39999        osd.81            up       1.00000   1.00000
>> -3   20.99991     host stor2
>> 13   2.09999        osd.13            up       1.00000   1.00000
>> 12   2.09999        osd.12            up       1.00000   1.00000
>> 14   2.09999        osd.14            up       1.00000   1.00000
>> 15   2.09999        osd.15            up       1.00000   1.00000
>> 16   2.09999        osd.16            up       1.00000   1.00000
>> 17   2.09999        osd.17            up       1.00000   1.00000
>> 18   2.09999        osd.18            down     0         1.00000
>> 19   2.09999        osd.19            up       1.00000   1.00000
>> 20   2.09999        osd.20            up       1.00000   1.00000
>> 21   2.09999        osd.21            up       1.00000   1.00000
>> -4   25.00000     host stor3
>> 30   2.50000        osd.30            up       1.00000   1.00000
>> 31   2.50000        osd.31            up       1.00000   1.00000
>> 32   2.50000        osd.32            up       1.00000   1.00000
>> 33   2.50000        osd.33            down     0         1.00000
>> 34   2.50000        osd.34            up       1.00000   1.00000
>> 35   2.50000        osd.35            up       1.00000   1.00000
>> 66   2.50000        osd.66            up       1.00000   1.00000
>> 67   2.50000        osd.67            up       1.00000   1.00000
>> 68   2.50000        osd.68            up       1.00000   1.00000
>> 72   2.50000        osd.72            down     0         1.00000
>> -5   25.00000     host stor4
>> 44   2.50000        osd.44            up       1.00000   1.00000
>> 45   2.50000        osd.45            up       1.00000   1.00000
>> 46   2.50000        osd.46            down     0         1.00000
>> 47   2.50000        osd.47            up       1.00000   1.00000
>> 0    2.50000        osd.0             up       1.00000   1.00000
>> 1    2.50000        osd.1             up       1.00000   1.00000
>> 2    2.50000        osd.2             up       1.00000   1.00000
>> 3    2.50000        osd.3             up       1.00000   1.00000
>> 4    2.50000        osd.4             up       1.00000   1.00000
>> 5    2.50000        osd.5             up       1.00000   1.00000
>> -6   14.19991     host stor5
>> 48   1.79999        osd.48            up       1.00000   1.00000
>> 49   1.59999        osd.49            up       1.00000   1.00000
>> 50   1.79999        osd.50            up       1.00000   1.00000
>> 51   1.79999        osd.51            down     0         1.00000
>> 52   1.79999        osd.52            up       1.00000   1.00000
>> 53   1.79999        osd.53            up       1.00000   1.00000
>> 54   1.79999        osd.54            up       1.00000   1.00000
>> 55   1.79999        osd.55            up       1.00000   1.00000
>> -14  14.39999     host stor6
>> 82   1.79999        osd.82            up       1.00000   1.00000
>> 83   1.79999        osd.83            up       1.00000   1.00000
>> 84   1.79999        osd.84            up       1.00000   1.00000
>> 85   1.79999        osd.85            up       1.00000   1.00000
>> 86   1.79999        osd.86            up       1.00000   1.00000
>> 87   1.79999        osd.87            up       1.00000   1.00000
>> 88   1.79999        osd.88            up       1.00000   1.00000
>> 89   1.79999        osd.89            up       1.00000   1.00000
>> -16  12.59999     host stor7
>> 93   1.79999        osd.93            up       1.00000   1.00000
>> 94   1.79999        osd.94            up       1.00000   1.00000
>> 95   1.79999        osd.95            up       1.00000   1.00000
>> 96   1.79999        osd.96            up       1.00000   1.00000
>> 97   1.79999        osd.97            up       1.00000   1.00000
>> 98   1.79999        osd.98            up       1.00000   1.00000
>> 99   1.79999        osd.99            up       1.00000   1.00000
>> -17  21.49995     host stor8
>> 22   1.59999        osd.22            up       1.00000   1.00000
>> 23   1.59999        osd.23            up       1.00000   1.00000
>> 36   2.09999        osd.36            up       1.00000   1.00000
>> 37   2.09999        osd.37            up       1.00000   1.00000
>> 38   2.50000        osd.38            up       1.00000   1.00000
>> 39   2.50000        osd.39            up       1.00000   1.00000
>> 40   2.50000        osd.40            up       1.00000   1.00000
>> 41   2.50000        osd.41            down     0         1.00000
>> 42   2.50000        osd.42            up       1.00000   1.00000
>> 43   1.59999        osd.43            up       1.00000   1.00000
>> [root@cc1 ~]#
>>
>> And ceph health detail:
>>
>> ceph health detail | grep down
>> HEALTH_WARN 23 pgs backfilling; 23 pgs degraded; 2 pgs down; 2 pgs peering;
>> 2 pgs stuck inactive; 25 pgs stuck unclean; 23 pgs undersized;
>> recovery 176211/14148564 objects degraded (1.245%);
>> recovery 238972/14148564 objects misplaced (1.689%); noout flag(s) set
>> pg 1.60 is stuck inactive since forever, current state down+remapped+peering, last acting [66,69,40]
>> pg 1.165 is stuck inactive since forever, current state down+remapped+peering, last acting [37]
>> pg 1.60 is stuck unclean since forever, current state down+remapped+peering, last acting [66,69,40]
>> pg 1.165 is stuck unclean since forever, current state down+remapped+peering, last acting [37]
>> pg 1.165 is down+remapped+peering, acting [37]
>> pg 1.60 is down+remapped+peering, acting [66,69,40]
>>
>> The problematic PGs are 1.165 and 1.60.
>>
>> Please advise how to unblock the volumes pool and/or get these two PGs
>> working - during the last night and day that we spent trying to solve this
>> issue, we verified that these PGs are 100% empty of data.
>>
>> --
>> Regards,
>> Łukasz Chrustek

--
Regards,
Łukasz Chrustek
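
PS: To make sure I understood the restart-and-query workaround, this is the
kind of loop I was planning to try for pg 1.165. It is only a rough sketch:
it assumes systemd-managed jewel OSDs, assumes osd.37 (the acting OSD) is the
one to restart, and wraps each attempt in timeout so a hanging query doesn't
block the loop.

# on the host carrying osd.37, restart the OSD:
systemctl restart ceph-osd@37

# then, from an admin node, retry the query until it succeeds, capping each
# attempt at 10 seconds so a hung attempt doesn't stall the loop:
until timeout 10 ceph tell 1.165 query > /tmp/pg-1.165-query.json; do
    sleep 1
done

Is that roughly what you meant?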
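
PS2: Regarding whether the two problems are connected: as far as I understand,
'rbd ls' on a pool with format 2 images reads the pool's rbd_directory object,
while 'rbd info' only touches the header objects of the image you name. If
that is right, then mapping that object should show whether the listing is
stuck behind one of the down PGs (this is only my guess - please correct me if
the mechanism is different):

ceph osd map volumes rbd_directory

If that reports pg 1.165 or 1.60, it would explain why 'rbd ls volumes' hangs
while 'rbd info' on individual volumes still works.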