Re: pgs stuck inactive

Can you install the debuginfo for ceph (how this works depends on your
distro) and run the following?

# gdb -ex 'r' -ex 't a a bt full' -ex 'q' --args ceph-objectstore-tool
import-rados volumes pg.3.367.export.OSD.35
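
For reference, installing the symbols usually looks something like the
following (package names are assumptions and vary by distro and release;
the backtrace is inside librados, so its symbols matter too):

# apt-get install ceph-dbg librados2-dbg      (Debian/Ubuntu, ceph.com packages)
# debuginfo-install ceph                      (RHEL/CentOS, needs yum-utils)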

On Thu, Mar 16, 2017 at 12:02 AM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:
> Hello,
>
>
> The ceph-objectstore-tool import-rados volumes pg.3.367.export.OSD.35
> command crashes.
>
> ~# ceph-objectstore-tool import-rados volumes pg.3.367.export.OSD.35
> *** Caught signal (Segmentation fault) **
>  in thread 7f85b60e28c0
>  ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
>  1: ceph-objectstore-tool() [0xaeeaba]
>  2: (()+0x10330) [0x7f85b4dca330]
>  3: (()+0xa2324) [0x7f85b1cd7324]
>  4: (()+0x7d23e) [0x7f85b1cb223e]
>  5: (()+0x7d478) [0x7f85b1cb2478]
>  6: (rados_ioctx_create()+0x32) [0x7f85b1c89f92]
>  7: (librados::Rados::ioctx_create(char const*, librados::IoCtx&)+0x15)
> [0x7f85b1c8a0e5]
>  8: (do_import_rados(std::string, bool)+0xb7c) [0x68199c]
>  9: (main()+0x1294) [0x651134]
>  10: (__libc_start_main()+0xf5) [0x7f85b0c69f45]
>  11: ceph-objectstore-tool() [0x66f8b7]
> 2017-03-15 14:57:05.567987 7f85b60e28c0 -1 *** Caught signal (Segmentation
> fault) **
>  in thread 7f85b60e28c0
>
>  ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
>  1: ceph-objectstore-tool() [0xaeeaba]
>  2: (()+0x10330) [0x7f85b4dca330]
>  3: (()+0xa2324) [0x7f85b1cd7324]
>  4: (()+0x7d23e) [0x7f85b1cb223e]
>  5: (()+0x7d478) [0x7f85b1cb2478]
>  6: (rados_ioctx_create()+0x32) [0x7f85b1c89f92]
>  7: (librados::Rados::ioctx_create(char const*, librados::IoCtx&)+0x15)
> [0x7f85b1c8a0e5]
>  8: (do_import_rados(std::string, bool)+0xb7c) [0x68199c]
>  9: (main()+0x1294) [0x651134]
>  10: (__libc_start_main()+0xf5) [0x7f85b0c69f45]
>  11: ceph-objectstore-tool() [0x66f8b7]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> --- begin dump of recent events ---
>    -14> 2017-03-15 14:57:05.557743 7f85b60e28c0  5 asok(0x5632000)
> register_command perfcounters_dump hook 0x55e6130
>    -13> 2017-03-15 14:57:05.557807 7f85b60e28c0  5 asok(0x5632000)
> register_command 1 hook 0x55e6130
>    -12> 2017-03-15 14:57:05.557818 7f85b60e28c0  5 asok(0x5632000)
> register_command perf dump hook 0x55e6130
>    -11> 2017-03-15 14:57:05.557828 7f85b60e28c0  5 asok(0x5632000)
> register_command perfcounters_schema hook 0x55e6130
>    -10> 2017-03-15 14:57:05.557836 7f85b60e28c0  5 asok(0x5632000)
> register_command 2 hook 0x55e6130
>     -9> 2017-03-15 14:57:05.557841 7f85b60e28c0  5 asok(0x5632000)
> register_command perf schema hook 0x55e6130
>     -8> 2017-03-15 14:57:05.557851 7f85b60e28c0  5 asok(0x5632000)
> register_command perf reset hook 0x55e6130
>     -7> 2017-03-15 14:57:05.557855 7f85b60e28c0  5 asok(0x5632000)
> register_command config show hook 0x55e6130
>     -6> 2017-03-15 14:57:05.557864 7f85b60e28c0  5 asok(0x5632000)
> register_command config set hook 0x55e6130
>     -5> 2017-03-15 14:57:05.557868 7f85b60e28c0  5 asok(0x5632000)
> register_command config get hook 0x55e6130
>     -4> 2017-03-15 14:57:05.557877 7f85b60e28c0  5 asok(0x5632000)
> register_command config diff hook 0x55e6130
>     -3> 2017-03-15 14:57:05.557880 7f85b60e28c0  5 asok(0x5632000)
> register_command log flush hook 0x55e6130
>     -2> 2017-03-15 14:57:05.557888 7f85b60e28c0  5 asok(0x5632000)
> register_command log dump hook 0x55e6130
>     -1> 2017-03-15 14:57:05.557892 7f85b60e28c0  5 asok(0x5632000)
> register_command log reopen hook 0x55e6130
>      0> 2017-03-15 14:57:05.567987 7f85b60e28c0 -1 *** Caught signal
> (Segmentation fault) **
>  in thread 7f85b60e28c0
>
>  ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
>  1: ceph-objectstore-tool() [0xaeeaba]
>  2: (()+0x10330) [0x7f85b4dca330]
>  3: (()+0xa2324) [0x7f85b1cd7324]
>  4: (()+0x7d23e) [0x7f85b1cb223e]
>  5: (()+0x7d478) [0x7f85b1cb2478]
>  6: (rados_ioctx_create()+0x32) [0x7f85b1c89f92]
>  7: (librados::Rados::ioctx_create(char const*, librados::IoCtx&)+0x15)
> [0x7f85b1c8a0e5]
>  8: (do_import_rados(std::string, bool)+0xb7c) [0x68199c]
>  9: (main()+0x1294) [0x651134]
>  10: (__libc_start_main()+0xf5) [0x7f85b0c69f45]
>  11: ceph-objectstore-tool() [0x66f8b7]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> --- logging levels ---
>    0/ 5 none
>    0/ 1 lockdep
>    0/ 1 context
>    1/ 1 crush
>    1/ 5 mds
>    1/ 5 mds_balancer
>    1/ 5 mds_locker
>    1/ 5 mds_log
>    1/ 5 mds_log_expire
>    1/ 5 mds_migrator
>    0/ 1 buffer
>    0/ 1 timer
>    0/ 1 filer
>    0/ 1 striper
>    0/ 1 objecter
>    0/ 5 rados
>    0/ 5 rbd
>    0/ 5 rbd_replay
>    0/ 5 journaler
>    0/ 5 objectcacher
>    0/ 5 client
>    0/ 5 osd
>    0/ 5 optracker
>    0/ 5 objclass
>    1/ 3 filestore
>    1/ 3 keyvaluestore
>    1/ 3 journal
>    0/ 5 ms
>    1/ 5 mon
>    0/10 monc
>    1/ 5 paxos
>    0/ 5 tp
>    1/ 5 auth
>    1/ 5 crypto
>    1/ 1 finisher
>    1/ 5 heartbeatmap
>    1/ 5 perfcounter
>    1/ 5 rgw
>    1/10 civetweb
>    1/ 5 javaclient
>    1/ 5 asok
>    1/ 1 throttle
>    0/ 0 refs
>    1/ 5 xio
>   -2/-2 (syslog threshold)
>   99/99 (stderr threshold)
>   max_recent       500
>   max_new         1000
>   log_file
> --- end dump of recent events ---
> Segmentation fault (core dumped)
> #
>
> Any ideas what to try?
>
> Thank you.
> Laszlo
>
>
> On 15.03.2017 04:27, Brad Hubbard wrote:
>>
>> Decide which copy you want to keep and export it with
>> ceph-objectstore-tool.
>>
>> Delete all copies on all OSDs with ceph-objectstore-tool (not by
>> deleting the directory on the disk).
>>
>> Use force_create_pg to recreate the PG as empty.
>>
>> Use ceph-objectstore-tool to do a rados import of the exported PG copy.
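>>
>> A minimal sketch of those four steps, using the PG and file names from
>> this thread; the data/journal paths are assumptions (default layout), the
>> export runs on the OSD whose copy you keep, the remove runs on every OSD
>> that holds a copy, and each OSD must be stopped while ceph-objectstore-tool
>> touches it:
>>
>> # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-35 \
>>       --journal-path /var/lib/ceph/osd/ceph-35/journal \
>>       --pgid 3.367 --op export --file pg.3.367.export.OSD.35
>> # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 \
>>       --journal-path /var/lib/ceph/osd/ceph-2/journal \
>>       --pgid 3.367 --op remove
>> # ceph pg force_create_pg 3.367
>> # ceph-objectstore-tool import-rados volumes pg.3.367.export.OSD.35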
>>
>>
>> On Wed, Mar 15, 2017 at 12:00 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx>
>> wrote:
>>>
>>> Hello,
>>>
>>> I have tried to recover the PG using the following steps:
>>> Preparation:
>>> 1. set noout
>>> 2. stop osd.2
>>> 3. use ceph-objectstore-tool to export the PG from osd.2
>>> 4. start osd.2
>>> 5. repeat steps 2-4 on osd.35, osd.28 and osd.63 (I did these hoping to be
>>> able to use one of those exports to recover the PG)
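>>>
>>> For reference, the preparation steps map to something like this (a
>>> sketch; the data/journal paths assume the default layout and the
>>> stop/start commands depend on the init system):
>>>
>>> # ceph osd set noout
>>> # stop ceph-osd id=2            (or: systemctl stop ceph-osd@2)
>>> # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 \
>>>       --journal-path /var/lib/ceph/osd/ceph-2/journal \
>>>       --pgid 3.367 --op export --file pg.3.367.export.OSD.2
>>> # start ceph-osd id=2           (or: systemctl start ceph-osd@2)
>>> # ceph osd unset noout          (after all the exports are done)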
>>>
>>>
>>> First attempt:
>>>
>>> 1. stop osd.2
>>> 2. remove the 3.367_head directory
>>> 3. start osd.2
>>> Here I was hoping that the cluster would recover the PG from the two other
>>> identical OSDs. It did NOT. So I tried the following commands on the PG:
>>> ceph pg repair 3.367
>>> ceph pg scrub 3.367
>>> ceph pg deep-scrub 3.367
>>> ceph pg force_create_pg 3.367
>>> Nothing changed; my PG was still incomplete. So I tried to remove the PG
>>> directory from all the OSDs that were referenced in the pg query:
>>>
>>>
>>> 1. stop osd.2
>>> 2. delete the 3.367_head directory
>>> 3. start osd.2
>>> 4. repeat steps 1-3 for all the OSDs that were listed in the pg query
>>> 5. do an import from one of the exports. -> After this I was again able to
>>> query the PG (that was impossible while all the 3.367_head dirs were
>>> deleted), and the stats said that the number of objects is 6 and the size
>>> is 21M (all correct values according to the files I was able to see before
>>> starting the procedure). But the PG is still incomplete.
>>>
>>> What else can I try?
>>>
>>> Thank you,
>>> Laszlo
>>>
>>>
>>>
>>>
>>>
>>> On 12.03.2017 13:06, Brad Hubbard wrote:
>>>>
>>>>
>>>> On Sun, Mar 12, 2017 at 7:51 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx>
>>>> wrote:
>>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> I have already done the export with ceph_objectstore_tool. I just have
>>>>> to
>>>>> decide which OSDs to keep.
>>>>> Can you tell me why the directory structure in the OSDs is different for
>>>>> the same PG when checking on different OSDs?
>>>>> For instance, in OSDs 2 and 63 there are NO subdirectories in
>>>>> 3.367_head, while OSDs 28 and 35 contain
>>>>> ./DIR_7/DIR_6/DIR_B/
>>>>> ./DIR_7/DIR_6/DIR_3/
>>>>>
>>>>> When are these subdirectories created?
>>>>>
>>>>> The files are identical on all the OSDs; only the way they are stored is
>>>>> different. It would be enough if you could point me to some documentation
>>>>> that explains this; I'll read it. So far, searching for the architecture
>>>>> of an OSD, I could not find the gory details about these directories.
>>>>
>>>>
>>>>
>>>> https://github.com/ceph/ceph/blob/master/src/os/filestore/HashIndex.h
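>>>>
>>>> In short, filestore hashes objects into nested DIR_* subdirectories and
>>>> splits a directory once it holds more objects than roughly
>>>> filestore_split_multiple * abs(filestore_merge_threshold) * 16, so two
>>>> OSDs can hold the same objects at different directory depths depending on
>>>> their splitting history. A sketch of how to check the values in effect
>>>> (assuming the default admin socket location, run on the OSD's host):
>>>>
>>>> # ceph daemon osd.2 config get filestore_merge_threshold
>>>> # ceph daemon osd.2 config get filestore_split_multiple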
>>>>
>>>>>
>>>>> Kind regards,
>>>>> Laszlo
>>>>>
>>>>>
>>>>> On 12.03.2017 02:12, Brad Hubbard wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Mar 11, 2017 at 7:43 PM, Laszlo Budai
>>>>>> <laszlo@xxxxxxxxxxxxxxxx>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Thank you for your answer.
>>>>>>>
>>>>>>> indeed the min_size is 1:
>>>>>>>
>>>>>>> # ceph osd pool get volumes size
>>>>>>> size: 3
>>>>>>> # ceph osd pool get volumes min_size
>>>>>>> min_size: 1
>>>>>>> #
>>>>>>> I'm going to try to find the mentioned discussions on the mailing lists
>>>>>>> and read them. If you have a link at hand, it would be nice if you could
>>>>>>> send it to me.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> This thread is one example, there are lots more.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-December/014846.html
>>>>>>
>>>>>>>
>>>>>>> In the attached file you can see the contents of the directory containing
>>>>>>> PG data on the different OSDs (all that have appeared in the pg query).
>>>>>>> According to the md5sums the files are identical. What bothers me is the
>>>>>>> directory structure (you can see the ls -R in each dir that contains
>>>>>>> files).
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> So I mixed up 63 and 68; my list should have read 2, 28, 35 and 63,
>>>>>> since 68 is listed as empty in the pg query.
>>>>>>
>>>>>>>
>>>>>>> Where can I read about how/why those DIR# subdirectories have
>>>>>>> appeared?
>>>>>>>
>>>>>>> Given that the files themselves are identical on the "current" OSDs
>>>>>>> belonging to the PG, and that osd.63 (currently not belonging to the PG)
>>>>>>> has the same files, is it safe to stop osd.2, remove the 3.367_head dir,
>>>>>>> and then restart the OSD? (all of this with the noout flag set, of
>>>>>>> course)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *You* need to decide which is the "good" copy and then follow the
>>>>>> instructions in the links I provided to try and recover the PG. Back
>>>>>> up the known copies on 2, 28, 35 and 63 with ceph-objectstore-tool
>>>>>> before proceeding. They may well be identical, but the peering process
>>>>>> still needs to "see" the relevant logs and currently something is
>>>>>> stopping it from doing so.
>>>>>>
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Laszlo
>>>>>>>
>>>>>>>
>>>>>>> On 11.03.2017 00:32, Brad Hubbard wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> So this is why it happened, I guess.
>>>>>>>>
>>>>>>>> pool 3 'volumes' replicated size 3 min_size 1
>>>>>>>>
>>>>>>>> min_size = 1 is a recipe for disasters like this and there are
>>>>>>>> plenty
>>>>>>>> of ML threads about not setting it below 2.
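>>>>>>>>
>>>>>>>> Once the cluster is healthy again, raising it is a one-liner (using
>>>>>>>> the pool name from this thread):
>>>>>>>>
>>>>>>>> # ceph osd pool set volumes min_size 2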
>>>>>>>>
>>>>>>>> The past intervals in the pg query show several intervals where a
>>>>>>>> single OSD may have gone rw.
>>>>>>>>
>>>>>>>> How important is this data?
>>>>>>>>
>>>>>>>> I would suggest checking which of these OSDs actually have the data
>>>>>>>> for this PG. From the pg query it looks like 2, 35 and 68, and possibly
>>>>>>>> 28 since it's the primary. Check all OSDs in the pg query output. I
>>>>>>>> would then back up all copies and work out which copy, if any, you
>>>>>>>> want to keep and then attempt something like the following.
>>>>>>>>
>>>>>>>> https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg17820.html
>>>>>>>>
>>>>>>>> If you want to abandon the pg see
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/012778.html
>>>>>>>> for a possible solution.
>>>>>>>>
>>>>>>>> http://ceph.com/community/incomplete-pgs-oh-my/ may also give some
>>>>>>>> ideas.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Mar 10, 2017 at 9:44 PM, Laszlo Budai
>>>>>>>> <laszlo@xxxxxxxxxxxxxxxx>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The OSDs are all there.
>>>>>>>>>
>>>>>>>>> $ sudo ceph osd stat
>>>>>>>>>      osdmap e60609: 72 osds: 72 up, 72 in
>>>>>>>>>
>>>>>>>>> and I have attached the results of the ceph osd tree and ceph osd dump
>>>>>>>>> commands.
>>>>>>>>> I got some extra info about the network problem. A faulty network device
>>>>>>>>> flooded the network, eating up all the bandwidth, so the OSDs were not
>>>>>>>>> able to communicate properly with each other. This lasted for almost one
>>>>>>>>> day.
>>>>>>>>>
>>>>>>>>> Thank you,
>>>>>>>>> Laszlo
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 10.03.2017 12:19, Brad Hubbard wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> To me it looks like someone may have done an "rm" on these OSDs
>>>>>>>>>> but
>>>>>>>>>> not removed them from the crushmap. This does not happen
>>>>>>>>>> automatically.
>>>>>>>>>>
>>>>>>>>>> Do these OSDs show up in "ceph osd tree" and "ceph osd dump" ? If
>>>>>>>>>> so,
>>>>>>>>>> paste the output.
>>>>>>>>>>
>>>>>>>>>> Without knowing what exactly happened here it may be difficult to
>>>>>>>>>> work
>>>>>>>>>> out how to proceed.
>>>>>>>>>>
>>>>>>>>>> In order to go clean, the primary needs to communicate with multiple
>>>>>>>>>> OSDs, some of which are marked DNE and seem to be uncontactable.
>>>>>>>>>>
>>>>>>>>>> This seems to be more than a network issue (unless the outage is
>>>>>>>>>> still
>>>>>>>>>> happening).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://docs.ceph.com/docs/master/rados/operations/pg-states/?highlight=incomplete
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Mar 10, 2017 at 6:09 PM, Laszlo Budai
>>>>>>>>>> <laszlo@xxxxxxxxxxxxxxxx>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> I was informed that due to a networking issue the ceph cluster network
>>>>>>>>>>> was affected. There was huge packet loss, and network interfaces were
>>>>>>>>>>> flapping. That's all I got.
>>>>>>>>>>> The outage lasted quite a long time. So I assume that some OSDs may
>>>>>>>>>>> have been considered dead and the data from them moved away to other
>>>>>>>>>>> OSDs (this is what ceph is supposed to do, if I'm correct). Probably
>>>>>>>>>>> that was the point when the peers listed in the query came into the
>>>>>>>>>>> picture.
>>>>>>>>>>> From the query we can see this for one of those OSDs:
>>>>>>>>>>>         {
>>>>>>>>>>>             "peer": "14",
>>>>>>>>>>>             "pgid": "3.367",
>>>>>>>>>>>             "last_update": "0'0",
>>>>>>>>>>>             "last_complete": "0'0",
>>>>>>>>>>>             "log_tail": "0'0",
>>>>>>>>>>>             "last_user_version": 0,
>>>>>>>>>>>             "last_backfill": "MAX",
>>>>>>>>>>>             "purged_snaps": "[]",
>>>>>>>>>>>             "history": {
>>>>>>>>>>>                 "epoch_created": 4,
>>>>>>>>>>>                 "last_epoch_started": 54899,
>>>>>>>>>>>                 "last_epoch_clean": 55143,
>>>>>>>>>>>                 "last_epoch_split": 0,
>>>>>>>>>>>                 "same_up_since": 60603,
>>>>>>>>>>>                 "same_interval_since": 60603,
>>>>>>>>>>>                 "same_primary_since": 60593,
>>>>>>>>>>>                 "last_scrub": "2852'33528",
>>>>>>>>>>>                 "last_scrub_stamp": "2017-02-26 02:36:55.210150",
>>>>>>>>>>>                 "last_deep_scrub": "2852'16480",
>>>>>>>>>>>                 "last_deep_scrub_stamp": "2017-02-21
>>>>>>>>>>> 00:14:08.866448",
>>>>>>>>>>>                 "last_clean_scrub_stamp": "2017-02-26
>>>>>>>>>>> 02:36:55.210150"
>>>>>>>>>>>             },
>>>>>>>>>>>             "stats": {
>>>>>>>>>>>                 "version": "0'0",
>>>>>>>>>>>                 "reported_seq": "14",
>>>>>>>>>>>                 "reported_epoch": "59779",
>>>>>>>>>>>                 "state": "down+peering",
>>>>>>>>>>>                 "last_fresh": "2017-02-27 16:30:16.230519",
>>>>>>>>>>>                 "last_change": "2017-02-27 16:30:15.267995",
>>>>>>>>>>>                 "last_active": "0.000000",
>>>>>>>>>>>                 "last_peered": "0.000000",
>>>>>>>>>>>                 "last_clean": "0.000000",
>>>>>>>>>>>                 "last_became_active": "0.000000",
>>>>>>>>>>>                 "last_became_peered": "0.000000",
>>>>>>>>>>>                 "last_unstale": "2017-02-27 16:30:16.230519",
>>>>>>>>>>>                 "last_undegraded": "2017-02-27 16:30:16.230519",
>>>>>>>>>>>                 "last_fullsized": "2017-02-27 16:30:16.230519",
>>>>>>>>>>>                 "mapping_epoch": 60601,
>>>>>>>>>>>                 "log_start": "0'0",
>>>>>>>>>>>                 "ondisk_log_start": "0'0",
>>>>>>>>>>>                 "created": 4,
>>>>>>>>>>>                 "last_epoch_clean": 55143,
>>>>>>>>>>>                 "parent": "0.0",
>>>>>>>>>>>                 "parent_split_bits": 0,
>>>>>>>>>>>                 "last_scrub": "2852'33528",
>>>>>>>>>>>                 "last_scrub_stamp": "2017-02-26 02:36:55.210150",
>>>>>>>>>>>                 "last_deep_scrub": "2852'16480",
>>>>>>>>>>>                 "last_deep_scrub_stamp": "2017-02-21
>>>>>>>>>>> 00:14:08.866448",
>>>>>>>>>>>                 "last_clean_scrub_stamp": "2017-02-26
>>>>>>>>>>> 02:36:55.210150",
>>>>>>>>>>>                 "log_size": 0,
>>>>>>>>>>>                 "ondisk_log_size": 0,
>>>>>>>>>>>                 "stats_invalid": "0",
>>>>>>>>>>>                 "stat_sum": {
>>>>>>>>>>>                     "num_bytes": 0,
>>>>>>>>>>>                     "num_objects": 0,
>>>>>>>>>>>                     "num_object_clones": 0,
>>>>>>>>>>>                     "num_object_copies": 0,
>>>>>>>>>>>                     "num_objects_missing_on_primary": 0,
>>>>>>>>>>>                     "num_objects_degraded": 0,
>>>>>>>>>>>                     "num_objects_misplaced": 0,
>>>>>>>>>>>                     "num_objects_unfound": 0,
>>>>>>>>>>>                     "num_objects_dirty": 0,
>>>>>>>>>>>                     "num_whiteouts": 0,
>>>>>>>>>>>                     "num_read": 0,
>>>>>>>>>>>                     "num_read_kb": 0,
>>>>>>>>>>>                     "num_write": 0,
>>>>>>>>>>>                     "num_write_kb": 0,
>>>>>>>>>>>                     "num_scrub_errors": 0,
>>>>>>>>>>>                     "num_shallow_scrub_errors": 0,
>>>>>>>>>>>                     "num_deep_scrub_errors": 0,
>>>>>>>>>>>                     "num_objects_recovered": 0,
>>>>>>>>>>>                     "num_bytes_recovered": 0,
>>>>>>>>>>>                     "num_keys_recovered": 0,
>>>>>>>>>>>                     "num_objects_omap": 0,
>>>>>>>>>>>                     "num_objects_hit_set_archive": 0,
>>>>>>>>>>>                     "num_bytes_hit_set_archive": 0
>>>>>>>>>>>                 },
>>>>>>>>>>>                 "up": [
>>>>>>>>>>>                     28,
>>>>>>>>>>>                     35,
>>>>>>>>>>>                     2
>>>>>>>>>>>                 ],
>>>>>>>>>>>                 "acting": [
>>>>>>>>>>>                     28,
>>>>>>>>>>>                     35,
>>>>>>>>>>>                     2
>>>>>>>>>>>                 ],
>>>>>>>>>>>                 "blocked_by": [],
>>>>>>>>>>>                 "up_primary": 28,
>>>>>>>>>>>                 "acting_primary": 28
>>>>>>>>>>>             },
>>>>>>>>>>>             "empty": 1,
>>>>>>>>>>>             "dne": 0,
>>>>>>>>>>>             "incomplete": 0,
>>>>>>>>>>>             "last_epoch_started": 0,
>>>>>>>>>>>             "hit_set_history": {
>>>>>>>>>>>                 "current_last_update": "0'0",
>>>>>>>>>>>                 "current_last_stamp": "0.000000",
>>>>>>>>>>>                 "current_info": {
>>>>>>>>>>>                     "begin": "0.000000",
>>>>>>>>>>>                     "end": "0.000000",
>>>>>>>>>>>                     "version": "0'0",
>>>>>>>>>>>                     "using_gmt": "1"
>>>>>>>>>>>                 },
>>>>>>>>>>>                 "history": []
>>>>>>>>>>>             }
>>>>>>>>>>>         },
>>>>>>>>>>>
>>>>>>>>>>> Where can I read more about the meaning of each parameter? Some of them
>>>>>>>>>>> have quite self-explanatory names, but not all (or probably we need
>>>>>>>>>>> deeper knowledge to understand them).
>>>>>>>>>>> Isn't there any parameter that would say when that OSD was assigned to
>>>>>>>>>>> the given PG? Also, the stat_sum shows 0 for all its parameters. Why is
>>>>>>>>>>> it blocking then?
>>>>>>>>>>>
>>>>>>>>>>> Is there a way to tell the PG to forget about that OSD?
>>>>>>>>>>>
>>>>>>>>>>> Thank you,
>>>>>>>>>>> Laszlo
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 10.03.2017 03:05, Brad Hubbard wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Can you explain more about what happened?
>>>>>>>>>>>>
>>>>>>>>>>>> The query shows progress is blocked by the following OSDs.
>>>>>>>>>>>>
>>>>>>>>>>>>                 "blocked_by": [
>>>>>>>>>>>>                     14,
>>>>>>>>>>>>                     17,
>>>>>>>>>>>>                     51,
>>>>>>>>>>>>                     58,
>>>>>>>>>>>>                     63,
>>>>>>>>>>>>                     64,
>>>>>>>>>>>>                     68,
>>>>>>>>>>>>                     70
>>>>>>>>>>>>                 ],
>>>>>>>>>>>>
>>>>>>>>>>>> Some of these OSDs are marked as "dne" (Does Not Exist).
>>>>>>>>>>>>
>>>>>>>>>>>> peer": "17",
>>>>>>>>>>>> "dne": 1,
>>>>>>>>>>>> "peer": "51",
>>>>>>>>>>>> "dne": 1,
>>>>>>>>>>>> "peer": "58",
>>>>>>>>>>>> "dne": 1,
>>>>>>>>>>>> "peer": "64",
>>>>>>>>>>>> "dne": 1,
>>>>>>>>>>>> "peer": "70",
>>>>>>>>>>>> "dne": 1,
>>>>>>>>>>>>
>>>>>>>>>>>> Can we get a complete background here please?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Mar 9, 2017 at 10:53 PM, Laszlo Budai
>>>>>>>>>>>> <laszlo@xxxxxxxxxxxxxxxx>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> After a major network outage our ceph cluster ended up with an
>>>>>>>>>>>>> inactive
>>>>>>>>>>>>> PG:
>>>>>>>>>>>>>
>>>>>>>>>>>>> # ceph health detail
>>>>>>>>>>>>> HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs stuck
>>>>>>>>>>>>> unclean;
>>>>>>>>>>>>> 1
>>>>>>>>>>>>> requests are blocked > 32 sec; 1 osds have slow requests
>>>>>>>>>>>>> pg 3.367 is stuck inactive for 912263.766607, current state
>>>>>>>>>>>>> incomplete,
>>>>>>>>>>>>> last
>>>>>>>>>>>>> acting [28,35,2]
>>>>>>>>>>>>> pg 3.367 is stuck unclean for 912263.766688, current state
>>>>>>>>>>>>> incomplete,
>>>>>>>>>>>>> last
>>>>>>>>>>>>> acting [28,35,2]
>>>>>>>>>>>>> pg 3.367 is incomplete, acting [28,35,2]
>>>>>>>>>>>>> 1 ops are blocked > 268435 sec
>>>>>>>>>>>>> 1 ops are blocked > 268435 sec on osd.28
>>>>>>>>>>>>> 1 osds have slow requests
>>>>>>>>>>>>>
>>>>>>>>>>>>> # ceph -s
>>>>>>>>>>>>>     cluster 6713d1b8-83da-11e6-aa79-525400d98c5a
>>>>>>>>>>>>>      health HEALTH_WARN
>>>>>>>>>>>>>             1 pgs incomplete
>>>>>>>>>>>>>             1 pgs stuck inactive
>>>>>>>>>>>>>             1 pgs stuck unclean
>>>>>>>>>>>>>             1 requests are blocked > 32 sec
>>>>>>>>>>>>>      monmap e3: 3 mons at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> {tv-dl360-1=10.12.193.73:6789/0,tv-dl360-2=10.12.193.74:6789/0,tv-dl360-3=10.12.193.75:6789/0}
>>>>>>>>>>>>>             election epoch 72, quorum 0,1,2
>>>>>>>>>>>>> tv-dl360-1,tv-dl360-2,tv-dl360-3
>>>>>>>>>>>>>      osdmap e60609: 72 osds: 72 up, 72 in
>>>>>>>>>>>>>       pgmap v3670252: 4864 pgs, 11 pools, 134 GB data, 23778
>>>>>>>>>>>>> objects
>>>>>>>>>>>>>             490 GB used, 130 TB / 130 TB avail
>>>>>>>>>>>>>                 4863 active+clean
>>>>>>>>>>>>>                    1 incomplete
>>>>>>>>>>>>>   client io 0 B/s rd, 38465 B/s wr, 2 op/s
>>>>>>>>>>>>>
>>>>>>>>>>>>> ceph pg repair doesn't change anything. What should I try to
>>>>>>>>>>>>> recover
>>>>>>>>>>>>> it?
>>>>>>>>>>>>> Attached is the result of ceph pg query on the problem PG.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>> Laszlo
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> ceph-users mailing list
>>>>>>>>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>



-- 
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


