You can use the commands below:

  ceph pg dump_stuck stale
  ceph pg dump_stuck inactive
  ceph pg dump_stuck unclean

Then query the PGs that are in the unclean or stale state and check for
any issue with a specific OSD.

Thanks
Swami

On Tue, Jun 21, 2016 at 3:02 PM, Paweł Sadowski <ceph@xxxxxxxxx> wrote:
> Hello,
>
> We have an issue on one of our clusters. One node with 9 OSDs was down
> for more than 12 hours. During that time the cluster recovered without
> problems. When the host came back into the cluster we got two PGs in the
> incomplete state. We decided to mark the OSDs on this host as out, but
> the two PGs are still incomplete. Trying to query those PGs hangs
> forever. We have already tried restarting the OSDs. Is there any way to
> solve this issue without losing data? Any help appreciated :)
>
> # ceph health detail | grep incomplete
> HEALTH_WARN 2 pgs incomplete; 2 pgs stuck inactive; 2 pgs stuck unclean;
> 200 requests are blocked > 32 sec; 2 osds have slow requests;
> noscrub,nodeep-scrub flag(s) set
> pg 3.2929 is stuck inactive since forever, current state incomplete,
> last acting [109,272,83]
> pg 3.1683 is stuck inactive since forever, current state incomplete,
> last acting [166,329,281]
> pg 3.2929 is stuck unclean since forever, current state incomplete, last
> acting [109,272,83]
> pg 3.1683 is stuck unclean since forever, current state incomplete, last
> acting [166,329,281]
> pg 3.1683 is incomplete, acting [166,329,281] (reducing pool vms
> min_size from 2 may help; search ceph.com/docs for 'incomplete')
> pg 3.2929 is incomplete, acting [109,272,83] (reducing pool vms min_size
> from 2 may help; search ceph.com/docs for 'incomplete')
>
> The directory for PG 3.1683 is present on OSD 166 and contains ~8 GB.
>
> We haven't tried setting min_size to 1 yet (we treat it as a last
> resort).
>
>
> Some cluster info:
>
> # ceph --version
> ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
>
> # ceph -s
>     health HEALTH_WARN
>            2 pgs incomplete
>            2 pgs stuck inactive
>            2 pgs stuck unclean
>            200 requests are blocked > 32 sec
>            noscrub,nodeep-scrub flag(s) set
>     monmap e7: 5 mons at
> {mon-03=*.2:6789/0,mon-04=*.36:6789/0,mon-05=*.81:6789/0,mon-06=*.0:6789/0,mon-07=*.40:6789/0}
>            election epoch 3250, quorum 0,1,2,3,4
> mon-06,mon-07,mon-04,mon-03,mon-05
>     osdmap e613040: 346 osds: 346 up, 337 in
>            flags noscrub,nodeep-scrub
>      pgmap v27163053: 18624 pgs, 6 pools, 138 TB data, 39062 kobjects
>            415 TB used, 186 TB / 601 TB avail
>                18622 active+clean
>                    2 incomplete
>   client io 9992 kB/s rd, 64867 kB/s wr, 8458 op/s
>
> # ceph osd pool get vms pg_num
> pg_num: 16384
>
> # ceph osd pool get vms size
> size: 3
>
> # ceph osd pool get vms min_size
> min_size: 2
>
> --
> PS
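
A concrete sketch of the steps Swami describes, using the PG and OSD IDs
reported above (output will differ per cluster, and the queries may hang
here, as Paweł already observed):

# ceph pg dump_stuck stale
# ceph pg dump_stuck inactive
# ceph pg dump_stuck unclean

# ceph pg 3.1683 query
# ceph pg 3.2929 query

# ceph pg map 3.1683
# ceph osd find 166

The last two commands show the current up/acting set for the PG and the
CRUSH location of a suspect OSD, which helps narrow the problem down to a
specific OSD as suggested.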