Re: incomplete pgs - cannot clear

Wyllys Ingersoll <wyllys.ingersoll@xxxxxxxxxxxxxx> · Thu, 14 Jun 2018 11:51:25 -0400

Yes, I did have the ignore_history_les option set for 2 of the running
osds, but I disabled it and restarted the affected osds, and this is where
it ends up:

            "probing_osds": [
                "20",
                "23",
                "30",
                "52"
            ],
            "down_osds_we_would_probe": [
                65,
                100,
                101,
                107
            ],
            "peering_blocked_by": [],
            "peering_blocked_by_detail": [
                {
                    "detail": "peering_blocked_by_history_les_bound"
                }
            ]

The 'down_osds_we_would_probe' are all non-existent.  This is where I
started the day, and I still can't get past it.  The same thing is seen on
all of the incomplete pgs; this is just 1 example.
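
To check the same thing across every incomplete pg, the query output can
be compared against the list of OSDs the cluster still knows about.  A
minimal sketch in Python, assuming the `ceph` CLI is on the PATH, that
`ceph osd ls --format json` returns a JSON array of ids, and that
'down_osds_we_would_probe' sits inside the "recovery_state" entries as in
the output above.  The helper names are hypothetical, not part of Ceph:

```python
import json
import subprocess


def ceph_json(*args):
    """Run a ceph CLI command and parse its JSON output (assumes ceph is on PATH)."""
    out = subprocess.check_output(["ceph", *args, "--format", "json"])
    return json.loads(out)


def down_osds_we_would_probe(pg_query):
    """Collect the ids listed under down_osds_we_would_probe in any recovery_state entry."""
    ids = set()
    for entry in pg_query.get("recovery_state", []):
        for osd in entry.get("down_osds_we_would_probe", []):
            ids.add(int(osd))
    return ids


def nonexistent_probe_targets(pg_query, existing_osd_ids):
    """Return, sorted, the probe targets that are not in the cluster's OSD list."""
    existing = {int(i) for i in existing_osd_ids}
    return sorted(down_osds_we_would_probe(pg_query) - existing)


def report(pgid):
    """Print the probe targets of one pg that no longer exist in the cluster."""
    query = ceph_json("pg", pgid, "query")
    existing = ceph_json("osd", "ls")
    print(pgid, nonexistent_probe_targets(query, existing))
```

Calling report("1.10e") for each incomplete pg would list the ids that
peering still wants to probe but that are gone from the cluster.  This
only reports; it changes nothing on the cluster.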

On Thu, Jun 14, 2018 at 11:47 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Thu, 14 Jun 2018, Wyllys Ingersoll wrote:
>> I set nobackfill and here is the output of a query for 1 of the incomplete pgs:
>>
>> $ ceph pg 1.10e query
>> {
>>     "state": "remapped",
> ...
>
>>     "snap_trimq": "[]",
>>     "snap_trimq_len": 0,
>>     "epoch": 465256,
>>     "up": [
>>         52,
>>         23,
>>         20
>>     ],
>>     "acting": [
>>         20
>>     ],
> ...
>>     "recovery_state": [
>>         {
>>             "name": "Started/Primary/Peering/WaitActingChange",
>>             "enter_time": "2018-06-14 11:38:54.482696",
>>             "comment": "waiting for pg acting set to change"
>
> You are probably in a pg_temp loop where it is cycling between two
> different acting sets.  Do you have the ignore_history_les option set
> on some nodes?  Make sure it is off everywhere.  And maybe repeat the
> query a few times to see if you can catch the other acting set.  If it
> keeps cycling you'll need to capture peering logs from the primary OSD
> to see what is going on exactly.
>
> sage
>
>
>
>>         },
>>         {
>>             "name": "Started",
>>             "enter_time": "2018-06-14 11:38:54.471136"
>>         }
>>     ],
>>     "agent_state": {}
>> }
>>
>> On Thu, Jun 14, 2018 at 11:36 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> > On Thu, 14 Jun 2018, Wyllys Ingersoll wrote:
>> >> Ceph Luminous 12.2.5 with filestore OSDs
>> >>
>> >> I have a cluster that had a bunch of disks removed due to failures and
>> >> hardware problems.  At this point, after a few days of
>> >> rebalancing and attempting to get healthy, it still has 16 incomplete
>> >> pgs that I cannot seem to get fixed.
>> >
>> > Rebalancing generally won't help peering; it's often easiest to tell
>> > what's going on if you temporarily set nobackfill and just focus on
>> > getting all of the PGs peered and/or active.
>> >
>> >> I've tried moving some of the pgs to other osds using the
>> >> ceph-objectstore-tool.  I've restarted some of the osds.  I've tried all
>> >> of the tricks I could find online for clearing these issues but they
>> >> persist.
>> >>
>> >> One problem appears to be that a lot of the osds are stuck or blocked
>> >> waiting for osds that no longer exist in the crush map.  'ceph osd
>> >> blocked-by' shows many osds that are not in the cluster anymore.  Is
>> >> there any way to force the osds that are stuck waiting for non-existent
>> >> osds to move on and drop them from their list?  Even restarting them
>> >> does not fix the issue.  Is it a bug that osds are blocking on
>> >> non-existent osds?
>> >>
>> >>
>> >> OBJECT_MISPLACED 610200/41085645 objects misplaced (1.485%)
>> >> PG_AVAILABILITY Reduced data availability: 16 pgs inactive, 3 pgs
>> >> peering, 13 pgs incomplete
>> >
>> > The incomplete or peering PGs are the ones to focus on.  Can you attach
>> > the result of a 'ceph tell <pgid> query'?
>> >
>> > sage
>> >
>> >
>> >>     pg 1.10e is stuck peering since forever, current state peering,
>> >> last acting [52,23,20]
>> >>     pg 1.12a is incomplete, acting [27,63,53] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.20b is incomplete, acting [84,59,18] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.24f is incomplete, acting [13,23,19] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.25c is incomplete, acting [23,52,60] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.2bd is incomplete, acting [59,53,19] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.2e4 is incomplete, acting [67,22,6] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.2fd is stuck peering since forever, current state peering,
>> >> last acting [79,53,58]
>> >>     pg 1.390 is incomplete, acting [81,18,2] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.482 is incomplete, acting [1,53,90] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.504 is incomplete, acting [59,96,53] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.688 is incomplete, acting [36,53,49] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.6dd is incomplete, acting [47,56,12] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.703 is incomplete, acting [47,2,51] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >>     pg 1.7a2 is stuck peering since forever, current state peering,
>> >> last acting [18,82,3]
>> >>     pg 1.7b4 is incomplete, acting [92,49,96] (reducing pool
>> >> cephfs_data min_size from 2 may help; search ceph.com/docs for
>> >> 'incomplete')
>> >> PG_DEGRADED Degraded data redundancy: 13439/41085645 objects degraded
>> >> (0.033%), 2 pgs degraded, 2 pgs undersized
>> >>     pg 1.74 is stuck undersized for 620.126459, current state
>> >> active+undersized+degraded+remapped+backfill_wait, last acting [17,6]
>> >>     pg 1.527 is stuck undersized for 712.173611, current state
>> >> active+undersized+degraded+remapped+backfill_wait, last acting [63,86]
>> >> REQUEST_SLOW 2 slow requests are blocked > 32 sec
>> >>     2 ops are blocked > 2097.15 sec
>> >>     osd.18 has blocked requests > 2097.15 sec
>> >> REQUEST_STUCK 63 stuck requests are blocked > 4096 sec
>> >>     22 ops are blocked > 134218 sec
>> >>     2 ops are blocked > 67108.9 sec
>> >>     28 ops are blocked > 8388.61 sec
>> >>     11 ops are blocked > 4194.3 sec
>> >>     osds 23,92 have stuck requests > 4194.3 sec
>> >>     osds 59,81 have stuck requests > 8388.61 sec
>> >>     osd.13 has stuck requests > 67108.9 sec
>> >>     osds 1,36,47,67,84 have stuck requests > 134218 sec
>> >>
>> >>
>> >> Any help would be much appreciated.
>> >>
>> >> Wyllys Ingersoll
>> >> Keeper Technology, LLC
>> >> --
>> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> >>
>> >>