Re: pgs stuck inactive

Brad Hubbard <bhubbard@xxxxxxxxxx> · Fri, 10 Mar 2017 20:19:58 +1000

To me it looks like someone may have done an "rm" on these OSDs without
also removing them from the crushmap. Removal from the crushmap does
not happen automatically.

Do these OSDs show up in "ceph osd tree" and "ceph osd dump"? If so,
please paste the output.
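
If it helps, here is a rough Python sketch of that cross-check. It
assumes the two commands were run with JSON output and that the parsed
results have an "osds" list (entries keyed by "osd") and a "nodes" list
(entries with "id" and "type") respectively; those shapes, the function
names and the sample data are assumptions for illustration, not output
from this cluster.

```python
def osdmap_ids(osd_dump):
    """OSD ids in the OSD map. Assumed shape: {"osds": [{"osd": 14, ...}, ...]}."""
    return {entry["osd"] for entry in osd_dump.get("osds", [])}


def crush_ids(osd_tree):
    """OSD ids in the CRUSH tree. Assumed shape: {"nodes": [{"id": 14, "type": "osd"}, ...]}."""
    return {node["id"] for node in osd_tree.get("nodes", [])
            if node.get("type") == "osd"}


def classify(blocked_by, in_osdmap, in_crush):
    """For each blocking OSD id, record where it is still known."""
    return {osd: {"in_osdmap": osd in in_osdmap, "in_crush": osd in in_crush}
            for osd in blocked_by}


def leftover_in_crush(report):
    """OSDs gone from the OSD map but still present in the CRUSH tree."""
    return sorted(osd for osd, seen in report.items()
                  if seen["in_crush"] and not seen["in_osdmap"])


# Hypothetical sample data, only to show the call pattern.
sample_dump = {"osds": [{"osd": 14}, {"osd": 63}]}
sample_tree = {"nodes": [{"id": -1, "type": "root"},
                         {"id": 14, "type": "osd"},
                         {"id": 17, "type": "osd"},
                         {"id": 63, "type": "osd"}]}
sample_report = classify([14, 17, 51], osdmap_ids(sample_dump), crush_ids(sample_tree))
for osd_id, seen in sorted(sample_report.items()):
    print(osd_id, seen)
```

An OSD that is reported by leftover_in_crush() would match the theory
above: removed from the OSD map but still sitting in the crushmap.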

Without knowing what exactly happened here it may be difficult to work
out how to proceed.

In order to go clean, the primary needs to communicate with multiple
OSDs, some of which are marked DNE and appear to be unreachable.
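
To see this at a glance, a small Python sketch along these lines can
summarize the query output. It assumes the "ceph pg 3.367 query" JSON
has already been parsed into a dict, that the blocking list lives under
info -> stats -> blocked_by, and that the per-peer entries sit in a
"peer_info" list. The "peer", "dne" and "empty" keys are the ones
visible in the pasted output; the rest of the layout, the function name
and the sample data are assumptions.

```python
def summarize_blockers(pg_query):
    """Return one row per OSD in blocked_by with its dne/empty flags, if reported."""
    stats = pg_query.get("info", {}).get("stats", {})
    blocked_by = stats.get("blocked_by", [])

    # Index peer entries by numeric OSD id. "peer" is a string in the
    # pasted output; tolerate a possible "(shard)" suffix just in case.
    peers = {}
    for entry in pg_query.get("peer_info", []):
        osd_id = int(str(entry["peer"]).split("(")[0])
        peers[osd_id] = entry

    rows = []
    for osd in blocked_by:
        entry = peers.get(osd)
        rows.append({
            "osd": osd,
            "reported": entry is not None,
            "dne": bool(entry.get("dne", 0)) if entry else None,
            "empty": bool(entry.get("empty", 0)) if entry else None,
        })
    return rows


# Hypothetical, heavily trimmed sample in the assumed layout.
sample_query = {
    "info": {"stats": {"blocked_by": [14, 17, 51]}},
    "peer_info": [
        {"peer": "14", "dne": 0, "empty": 1},
        {"peer": "17", "dne": 1},
    ],
}
for row in summarize_blockers(sample_query):
    print(row)
```

Rows with dne set are the peers the primary is waiting on even though
they are reported as not existing.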

This seems to be more than a network issue (unless the outage is still
happening).

http://docs.ceph.com/docs/master/rados/operations/pg-states/?highlight=incomplete

On Fri, Mar 10, 2017 at 6:09 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:
> Hello,
>
> I was informed that the ceph cluster network was affected by a networking
> issue. There was heavy packet loss, and network interfaces were
> flapping. That's all I got.
> The outage lasted for an extended period, so I assume that some OSDs
> were considered dead and their data was moved to other OSDs (this is
> what ceph is supposed to do, if I'm correct). That is probably when the
> listed OSDs came into the picture.
> From the query we can see this for one of those OSDs:
>         {
>             "peer": "14",
>             "pgid": "3.367",
>             "last_update": "0'0",
>             "last_complete": "0'0",
>             "log_tail": "0'0",
>             "last_user_version": 0,
>             "last_backfill": "MAX",
>             "purged_snaps": "[]",
>             "history": {
>                 "epoch_created": 4,
>                 "last_epoch_started": 54899,
>                 "last_epoch_clean": 55143,
>                 "last_epoch_split": 0,
>                 "same_up_since": 60603,
>                 "same_interval_since": 60603,
>                 "same_primary_since": 60593,
>                 "last_scrub": "2852'33528",
>                 "last_scrub_stamp": "2017-02-26 02:36:55.210150",
>                 "last_deep_scrub": "2852'16480",
>                 "last_deep_scrub_stamp": "2017-02-21 00:14:08.866448",
>                 "last_clean_scrub_stamp": "2017-02-26 02:36:55.210150"
>             },
>             "stats": {
>                 "version": "0'0",
>                 "reported_seq": "14",
>                 "reported_epoch": "59779",
>                 "state": "down+peering",
>                 "last_fresh": "2017-02-27 16:30:16.230519",
>                 "last_change": "2017-02-27 16:30:15.267995",
>                 "last_active": "0.000000",
>                 "last_peered": "0.000000",
>                 "last_clean": "0.000000",
>                 "last_became_active": "0.000000",
>                 "last_became_peered": "0.000000",
>                 "last_unstale": "2017-02-27 16:30:16.230519",
>                 "last_undegraded": "2017-02-27 16:30:16.230519",
>                 "last_fullsized": "2017-02-27 16:30:16.230519",
>                 "mapping_epoch": 60601,
>                 "log_start": "0'0",
>                 "ondisk_log_start": "0'0",
>                 "created": 4,
>                 "last_epoch_clean": 55143,
>                 "parent": "0.0",
>                 "parent_split_bits": 0,
>                 "last_scrub": "2852'33528",
>                 "last_scrub_stamp": "2017-02-26 02:36:55.210150",
>                 "last_deep_scrub": "2852'16480",
>                 "last_deep_scrub_stamp": "2017-02-21 00:14:08.866448",
>                 "last_clean_scrub_stamp": "2017-02-26 02:36:55.210150",
>                 "log_size": 0,
>                 "ondisk_log_size": 0,
>                 "stats_invalid": "0",
>                 "stat_sum": {
>                     "num_bytes": 0,
>                     "num_objects": 0,
>                     "num_object_clones": 0,
>                     "num_object_copies": 0,
>                     "num_objects_missing_on_primary": 0,
>                     "num_objects_degraded": 0,
>                     "num_objects_misplaced": 0,
>                     "num_objects_unfound": 0,
>                     "num_objects_dirty": 0,
>                     "num_whiteouts": 0,
>                     "num_read": 0,
>                     "num_read_kb": 0,
>                     "num_write": 0,
>                     "num_write_kb": 0,
>                     "num_scrub_errors": 0,
>                     "num_shallow_scrub_errors": 0,
>                     "num_deep_scrub_errors": 0,
>                     "num_objects_recovered": 0,
>                     "num_bytes_recovered": 0,
>                     "num_keys_recovered": 0,
>                     "num_objects_omap": 0,
>                     "num_objects_hit_set_archive": 0,
>                     "num_bytes_hit_set_archive": 0
>                 },
>                 "up": [
>                     28,
>                     35,
>                     2
>                 ],
>                 "acting": [
>                     28,
>                     35,
>                     2
>                 ],
>                 "blocked_by": [],
>                 "up_primary": 28,
>                 "acting_primary": 28
>             },
>             "empty": 1,
>             "dne": 0,
>             "incomplete": 0,
>             "last_epoch_started": 0,
>             "hit_set_history": {
>                 "current_last_update": "0'0",
>                 "current_last_stamp": "0.000000",
>                 "current_info": {
>                     "begin": "0.000000",
>                     "end": "0.000000",
>                     "version": "0'0",
>                     "using_gmt": "1"
>                 },
>                 "history": []
>             }
>         },
>
> Where can I read more about the meaning of each parameter? Some of them have
> quite self-explanatory names, but not all (or perhaps deeper knowledge is
> needed to understand them).
> Is there a parameter that says when that OSD was assigned to the
> given PG? Also, stat_sum shows 0 for all of its parameters. Why is it
> blocking then?
>
> Is there a way to tell the PG to forget about that OSD?
>
> Thank you,
> Laszlo
>
>
> On 10.03.2017 03:05, Brad Hubbard wrote:
>>
>> Can you explain more about what happened?
>>
>> The query shows progress is blocked by the following OSDs.
>>
>>                 "blocked_by": [
>>                     14,
>>                     17,
>>                     51,
>>                     58,
>>                     63,
>>                     64,
>>                     68,
>>                     70
>>                 ],
>>
>> Some of these OSDs are marked as "dne" (Does Not Exist).
>>
>> "peer": "17",
>> "dne": 1,
>> "peer": "51",
>> "dne": 1,
>> "peer": "58",
>> "dne": 1,
>> "peer": "64",
>> "dne": 1,
>> "peer": "70",
>> "dne": 1,
>>
>> Can we get a complete background here please?
>>
>>
>> On Thu, Mar 9, 2017 at 10:53 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:
>>>
>>> Hello,
>>>
>>> After a major network outage our ceph cluster ended up with an inactive
>>> PG:
>>>
>>> # ceph health detail
>>> HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow requests
>>> pg 3.367 is stuck inactive for 912263.766607, current state incomplete, last acting [28,35,2]
>>> pg 3.367 is stuck unclean for 912263.766688, current state incomplete, last acting [28,35,2]
>>> pg 3.367 is incomplete, acting [28,35,2]
>>> 1 ops are blocked > 268435 sec
>>> 1 ops are blocked > 268435 sec on osd.28
>>> 1 osds have slow requests
>>>
>>> # ceph -s
>>>     cluster 6713d1b8-83da-11e6-aa79-525400d98c5a
>>>      health HEALTH_WARN
>>>             1 pgs incomplete
>>>             1 pgs stuck inactive
>>>             1 pgs stuck unclean
>>>             1 requests are blocked > 32 sec
>>>      monmap e3: 3 mons at {tv-dl360-1=10.12.193.73:6789/0,tv-dl360-2=10.12.193.74:6789/0,tv-dl360-3=10.12.193.75:6789/0}
>>>             election epoch 72, quorum 0,1,2 tv-dl360-1,tv-dl360-2,tv-dl360-3
>>>      osdmap e60609: 72 osds: 72 up, 72 in
>>>       pgmap v3670252: 4864 pgs, 11 pools, 134 GB data, 23778 objects
>>>             490 GB used, 130 TB / 130 TB avail
>>>                 4863 active+clean
>>>                    1 incomplete
>>>   client io 0 B/s rd, 38465 B/s wr, 2 op/s
>>>
>>> ceph pg repair doesn't change anything. What should I try to recover it?
>>> Attached is the result of ceph pg query on the problem PG.
>>>
>>> Thank you,
>>> Laszlo
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>>
>>
>

-- 
Cheers,
Brad