'ceph pg query' on those PGs hangs forever. We ended up using
*ceph-objectstore-tool* *mark-complete* on those PGs (a rough sketch of that
procedure is at the bottom of this mail, below the quoted thread).

On 06/22/2016 11:45 AM, 施柏安 wrote:
> Hi,
> You can use the command 'ceph pg query' to check what's going on with the
> PGs which have a problem, and use "ceph-objectstore-tool" to recover that PG.
>
> 2016-06-21 19:09 GMT+08:00 Paweł Sadowski <ceph@xxxxxxxxx>:
>
> Already restarted those OSDs and then the whole cluster (rack by rack,
> failure domain is rack in this setup).
> We would like to try the *ceph-objectstore-tool mark-complete* operation.
> Is there any way (other than checking mtime on files and querying PGs) to
> determine which replica has the most up-to-date data?
>
> On 06/21/2016 12:37 PM, M Ranga Swami Reddy wrote:
> > Try to restart OSDs 109 and 166 and check if that helps?
> >
> >
> > On Tue, Jun 21, 2016 at 4:05 PM, Paweł Sadowski <ceph@xxxxxxxxx> wrote:
> >> Thanks for the response.
> >>
> >> All OSDs seem to be OK: they have been restarted, joined the cluster
> >> after that, nothing weird in the logs.
> >>
> >> # ceph pg dump_stuck stale
> >> ok
> >>
> >> # ceph pg dump_stuck inactive
> >> ok
> >> pg_stat  state       up             up_primary  acting         acting_primary
> >> 3.2929   incomplete  [109,272,83]   109         [109,272,83]   109
> >> 3.1683   incomplete  [166,329,281]  166         [166,329,281]  166
> >>
> >> # ceph pg dump_stuck unclean
> >> ok
> >> pg_stat  state       up             up_primary  acting         acting_primary
> >> 3.2929   incomplete  [109,272,83]   109         [109,272,83]   109
> >> 3.1683   incomplete  [166,329,281]  166         [166,329,281]  166
> >>
> >>
> >> On OSD 166 there are 100 blocked ops (on 109 too); they all end on
> >> "event": "reached_pg"
> >>
> >> # ceph --admin-daemon /var/run/ceph/ceph-osd.166.asok dump_ops_in_flight
> >> ...
> >>         {
> >>             "description": "osd_op(client.958764031.0:18137113
> >> rbd_data.392585982ae8944a.0000000000000ad4 [set-alloc-hint object_size
> >> 4194304 write_size 4194304,write 2641920~8192] 3.d6195683 RETRY=15
> >> ack+ondisk+retry+write+known_if_redirected e613241)",
> >>             "initiated_at": "2016-06-21 10:19:59.894393",
> >>             "age": 828.025527,
> >>             "duration": 600.020809,
> >>             "type_data": [
> >>                 "reached pg",
> >>                 {
> >>                     "client": "client.958764031",
> >>                     "tid": 18137113
> >>                 },
> >>                 [
> >>                     {
> >>                         "time": "2016-06-21 10:19:59.894393",
> >>                         "event": "initiated"
> >>                     },
> >>                     {
> >>                         "time": "2016-06-21 10:29:59.915202",
> >>                         "event": "reached_pg"
> >>                     }
> >>                 ]
> >>             ]
> >>         }
> >>     ],
> >>     "num_ops": 100
> >> }
> >>
> >>
> >>
> >> On 06/21/2016 12:27 PM, M Ranga Swami Reddy wrote:
> >>> You can use the below commands:
> >>> ===
> >>> ceph pg dump_stuck stale
> >>> ceph pg dump_stuck inactive
> >>> ceph pg dump_stuck unclean
> >>> ===
> >>>
> >>> And then query the PGs which are in the unclean or stale state, checking
> >>> for any issue with a specific OSD.
> >>>
> >>> Thanks
> >>> Swami
> >>>
> >>> On Tue, Jun 21, 2016 at 3:02 PM, Paweł Sadowski <ceph@xxxxxxxxx> wrote:
> >>>> Hello,
> >>>>
> >>>> We have an issue on one of our clusters. One node with 9 OSDs was down
> >>>> for more than 12 hours. During that time the cluster recovered without
> >>>> problems. When the host came back into the cluster we got two PGs in
> >>>> the incomplete state. We decided to mark the OSDs on this host as out,
> >>>> but the two PGs are still incomplete. Trying to query those PGs hangs
> >>>> forever. We have already tried restarting the OSDs. Is there any way to
> >>>> solve this issue without losing data?
> >>>> Any help appreciated. :)
> >>>>
> >>>> # ceph health detail | grep incomplete
> >>>> HEALTH_WARN 2 pgs incomplete; 2 pgs stuck inactive; 2 pgs stuck unclean;
> >>>> 200 requests are blocked > 32 sec; 2 osds have slow requests;
> >>>> noscrub,nodeep-scrub flag(s) set
> >>>> pg 3.2929 is stuck inactive since forever, current state incomplete,
> >>>> last acting [109,272,83]
> >>>> pg 3.1683 is stuck inactive since forever, current state incomplete,
> >>>> last acting [166,329,281]
> >>>> pg 3.2929 is stuck unclean since forever, current state incomplete,
> >>>> last acting [109,272,83]
> >>>> pg 3.1683 is stuck unclean since forever, current state incomplete,
> >>>> last acting [166,329,281]
> >>>> pg 3.1683 is incomplete, acting [166,329,281] (reducing pool vms
> >>>> min_size from 2 may help; search ceph.com/docs for 'incomplete')
> >>>> pg 3.2929 is incomplete, acting [109,272,83] (reducing pool vms
> >>>> min_size from 2 may help; search ceph.com/docs for 'incomplete')
> >>>>
> >>>> The directory for PG 3.1683 is present on OSD 166 and contains ~8 GB.
> >>>>
> >>>> We didn't try setting min_size to 1 yet (we treat it as a last resort).
> >>>>
> >>>>
> >>>> Some cluster info:
> >>>> # ceph --version
> >>>> ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
> >>>>
> >>>> # ceph -s
> >>>>      health HEALTH_WARN
> >>>>             2 pgs incomplete
> >>>>             2 pgs stuck inactive
> >>>>             2 pgs stuck unclean
> >>>>             200 requests are blocked > 32 sec
> >>>>             noscrub,nodeep-scrub flag(s) set
> >>>>      monmap e7: 5 mons at
> >>>> {mon-03=*.2:6789/0,mon-04=*.36:6789/0,mon-05=*.81:6789/0,mon-06=*.0:6789/0,mon-07=*.40:6789/0}
> >>>>             election epoch 3250, quorum 0,1,2,3,4
> >>>> mon-06,mon-07,mon-04,mon-03,mon-05
> >>>>      osdmap e613040: 346 osds: 346 up, 337 in
> >>>>             flags noscrub,nodeep-scrub
> >>>>       pgmap v27163053: 18624 pgs, 6 pools, 138 TB data, 39062 kobjects
> >>>>             415 TB used, 186 TB / 601 TB avail
> >>>>                18622 active+clean
> >>>>                    2 incomplete
> >>>>   client io 9992 kB/s rd, 64867 kB/s wr, 8458 op/s
> >>>>
> >>>>
> >>>> # ceph osd pool get vms pg_num
> >>>> pg_num: 16384
> >>>>
> >>>> # ceph osd pool get vms size
> >>>> size: 3
> >>>>
> >>>> # ceph osd pool get vms min_size
> >>>> min_size: 2

-- 
PS
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
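
For the archives: the mark-complete procedure referenced at the top had roughly
the shape below. This is an outline rather than an exact transcript (the OSD
id, data/journal paths and backup file name are only examples, shown for
PG 3.1683 on OSD 166), and mark-complete should be run only against the
replica that turns out to hold the newest data, after exporting a backup.

Set noout and stop the OSD; ceph-objectstore-tool cannot be used while the
OSD daemon is running:

  # ceph osd set noout
  # stop ceph-osd id=166        (or: systemctl stop ceph-osd@166)

Compare the PG info on every OSD in the acting set; the replica reporting
the highest last_update should be the most recent one:

  # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-166 \
        --journal-path /var/lib/ceph/osd/ceph-166/journal \
        --pgid 3.1683 --op info

Export the PG as a backup before changing anything:

  # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-166 \
        --journal-path /var/lib/ceph/osd/ceph-166/journal \
        --pgid 3.1683 --op export --file /root/pg-3.1683.export

Then, on the OSD holding the most recent copy only:

  # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-166 \
        --journal-path /var/lib/ceph/osd/ceph-166/journal \
        --pgid 3.1683 --op mark-complete

  # start ceph-osd id=166
  # ceph osd unset noout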