Re: pgs stuck inactive

Hello,

Thank you for your answer.

Indeed, min_size is 1:

# ceph osd pool get volumes size
size: 3
# ceph osd pool get volumes min_size
min_size: 1
#
I'll try to find the discussions you mentioned on the mailing lists and read them. If you have a link at hand, it would be nice if you could send it to me.

In the attached file you can see the contents of the directory containing the PG data on the different OSDs (all that appeared in the pg query).
According to the md5sums the files are identical. What bothers me is the directory structure (you can see the ls -R output for each dir that contains files).

Where can I read about how/why those DIR# subdirectories have appeared?
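(Comparing the hashes with the paths on osd.28 and osd.35, the nesting seems to follow the trailing hex digits of the object's hash, taken in reverse order. A quick illustration of what I mean, for the object with hash 2E024B67:

$ hash=2E024B67
$ echo "$hash" | rev | cut -c1-3 | sed 's/./DIR_&\//g'
DIR_7/DIR_6/DIR_B/

which matches where that object sits on osd.28. I'm guessing the split depth depends on how many objects the PG held at some point (the filestore split/merge thresholds?), but I'd like to read up on the details.)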

Given that the files themselves are identical on the "current" OSDs belonging to the PG, and that osd.63 (currently not part of the PG) has the same files, is it safe to stop osd.2, remove its 3.367_head dir, and then restart the OSD? (All of this with the noout flag set, of course.)
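What I have in mind, roughly, is the following (moving the directory aside rather than deleting it, so it can be put back if something goes wrong; I haven't run this yet):

# ceph osd set noout
# stop ceph-osd id=2          <- or however OSDs are stopped on these nodes
# mv /var/lib/ceph/osd/ceph-2/current/3.367_head /root/3.367_head.osd2.bak
# start ceph-osd id=2
# ceph osd unset noout

Would that be safe, or is there a better way?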

Kind regards,
Laszlo

On 11.03.2017 00:32, Brad Hubbard wrote:
So this is why it happened I guess.

pool 3 'volumes' replicated size 3 min_size 1

min_size = 1 is a recipe for disasters like this and there are plenty
of ML threads about not setting it below 2.
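
Once the cluster is healthy again, raising it is a one-liner (size 3 / min_size 2 is the usual combination):

# ceph osd pool set volumes min_size 2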

The past intervals in the pg query show several intervals where a
single OSD may have gone rw.

How important is this data?

I would suggest checking which of these OSDs actually have the data
for this pg. From the pg query it looks like 2, 35 and 68 and possibly
28 since it's the primary. Check all OSDs in the pg query output. I
would then back up all copies and work out which copy, if any, you
want to keep and then attempt something like the following.

https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg17820.html

If you want to abandon the pg see
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/012778.html
for a possible solution.

http://ceph.com/community/incomplete-pgs-oh-my/ may also give some ideas.
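
For the backups, ceph-objectstore-tool can export the pg from each OSD that holds a copy (with that OSD stopped), something along these lines (file names are just examples):

# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-28 \
    --journal-path /var/lib/ceph/osd/ceph-28/journal \
    --pgid 3.367 --op export --file /root/3.367-osd28.export

Repeat that for each OSD that has data, so you have something to import back if the recovery attempt goes wrong.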


On Fri, Mar 10, 2017 at 9:44 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:
The OSDs are all there.

$ sudo ceph osd stat
     osdmap e60609: 72 osds: 72 up, 72 in

and I have attached the results of the ceph osd tree and ceph osd dump commands.
I got some extra info about the network problem. A faulty network device flooded the network, eating up all the bandwidth, so the OSDs were not able to communicate properly with each other. This lasted for almost one day.

Thank you,
Laszlo



On 10.03.2017 12:19, Brad Hubbard wrote:

To me it looks like someone may have done an "rm" on these OSDs but
not removed them from the crushmap. This does not happen
automatically.

Do these OSDs show up in "ceph osd tree" and "ceph osd dump" ? If so,
paste the output.

Without knowing what exactly happened here it may be difficult to work
out how to proceed.

In order to go clean, the primary needs to communicate with multiple
OSDs, some of which are marked DNE and seem to be uncontactable.

This seems to be more than a network issue (unless the outage is still
happening).


http://docs.ceph.com/docs/master/rados/operations/pg-states/?highlight=incomplete



On Fri, Mar 10, 2017 at 6:09 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:

Hello,

I was informed that due to a networking issue the ceph cluster network was affected. There was huge packet loss, and network interfaces were flapping. That's all I got.
This outage lasted for a longer period of time, so I assume that some OSDs may have been considered dead and the data from them moved to other OSDs (this is what ceph is supposed to do, if I'm correct). Probably that was the point when the OSDs listed in the query came into the picture.
From the query we can see this for one of those OSDs:
        {
            "peer": "14",
            "pgid": "3.367",
            "last_update": "0'0",
            "last_complete": "0'0",
            "log_tail": "0'0",
            "last_user_version": 0,
            "last_backfill": "MAX",
            "purged_snaps": "[]",
            "history": {
                "epoch_created": 4,
                "last_epoch_started": 54899,
                "last_epoch_clean": 55143,
                "last_epoch_split": 0,
                "same_up_since": 60603,
                "same_interval_since": 60603,
                "same_primary_since": 60593,
                "last_scrub": "2852'33528",
                "last_scrub_stamp": "2017-02-26 02:36:55.210150",
                "last_deep_scrub": "2852'16480",
                "last_deep_scrub_stamp": "2017-02-21 00:14:08.866448",
                "last_clean_scrub_stamp": "2017-02-26 02:36:55.210150"
            },
            "stats": {
                "version": "0'0",
                "reported_seq": "14",
                "reported_epoch": "59779",
                "state": "down+peering",
                "last_fresh": "2017-02-27 16:30:16.230519",
                "last_change": "2017-02-27 16:30:15.267995",
                "last_active": "0.000000",
                "last_peered": "0.000000",
                "last_clean": "0.000000",
                "last_became_active": "0.000000",
                "last_became_peered": "0.000000",
                "last_unstale": "2017-02-27 16:30:16.230519",
                "last_undegraded": "2017-02-27 16:30:16.230519",
                "last_fullsized": "2017-02-27 16:30:16.230519",
                "mapping_epoch": 60601,
                "log_start": "0'0",
                "ondisk_log_start": "0'0",
                "created": 4,
                "last_epoch_clean": 55143,
                "parent": "0.0",
                "parent_split_bits": 0,
                "last_scrub": "2852'33528",
                "last_scrub_stamp": "2017-02-26 02:36:55.210150",
                "last_deep_scrub": "2852'16480",
                "last_deep_scrub_stamp": "2017-02-21 00:14:08.866448",
                "last_clean_scrub_stamp": "2017-02-26 02:36:55.210150",
                "log_size": 0,
                "ondisk_log_size": 0,
                "stats_invalid": "0",
                "stat_sum": {
                    "num_bytes": 0,
                    "num_objects": 0,
                    "num_object_clones": 0,
                    "num_object_copies": 0,
                    "num_objects_missing_on_primary": 0,
                    "num_objects_degraded": 0,
                    "num_objects_misplaced": 0,
                    "num_objects_unfound": 0,
                    "num_objects_dirty": 0,
                    "num_whiteouts": 0,
                    "num_read": 0,
                    "num_read_kb": 0,
                    "num_write": 0,
                    "num_write_kb": 0,
                    "num_scrub_errors": 0,
                    "num_shallow_scrub_errors": 0,
                    "num_deep_scrub_errors": 0,
                    "num_objects_recovered": 0,
                    "num_bytes_recovered": 0,
                    "num_keys_recovered": 0,
                    "num_objects_omap": 0,
                    "num_objects_hit_set_archive": 0,
                    "num_bytes_hit_set_archive": 0
                },
                "up": [
                    28,
                    35,
                    2
                ],
                "acting": [
                    28,
                    35,
                    2
                ],
                "blocked_by": [],
                "up_primary": 28,
                "acting_primary": 28
            },
            "empty": 1,
            "dne": 0,
            "incomplete": 0,
            "last_epoch_started": 0,
            "hit_set_history": {
                "current_last_update": "0'0",
                "current_last_stamp": "0.000000",
                "current_info": {
                    "begin": "0.000000",
                    "end": "0.000000",
                    "version": "0'0",
                    "using_gmt": "1"
                },
                "history": []
            }
        },

Where can I read more about the meaning of each parameter? Some of them have quite self-explanatory names, but not all (or probably we need deeper knowledge to understand them).
Isn't there any parameter that would say when that OSD was assigned to the given PG? Also, stat_sum shows 0 for all its fields. Why is it blocking then?

Is there a way to tell the PG to forget about that OSD?

Thank you,
Laszlo


On 10.03.2017 03:05, Brad Hubbard wrote:


Can you explain more about what happened?

The query shows progress is blocked by the following OSDs.

                "blocked_by": [
                    14,
                    17,
                    51,
                    58,
                    63,
                    64,
                    68,
                    70
                ],

Some of these OSDs are marked as "dne" (Does Not Exist).

peer": "17",
"dne": 1,
"peer": "51",
"dne": 1,
"peer": "58",
"dne": 1,
"peer": "64",
"dne": 1,
"peer": "70",
"dne": 1,

Can we get a complete background here please?


On Thu, Mar 9, 2017 at 10:53 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:


Hello,

After a major network outage our ceph cluster ended up with an inactive PG:

# ceph health detail
HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow requests
pg 3.367 is stuck inactive for 912263.766607, current state incomplete, last acting [28,35,2]
pg 3.367 is stuck unclean for 912263.766688, current state incomplete, last acting [28,35,2]
pg 3.367 is incomplete, acting [28,35,2]
1 ops are blocked > 268435 sec
1 ops are blocked > 268435 sec on osd.28
1 osds have slow requests

# ceph -s
    cluster 6713d1b8-83da-11e6-aa79-525400d98c5a
     health HEALTH_WARN
            1 pgs incomplete
            1 pgs stuck inactive
            1 pgs stuck unclean
            1 requests are blocked > 32 sec
     monmap e3: 3 mons at {tv-dl360-1=10.12.193.73:6789/0,tv-dl360-2=10.12.193.74:6789/0,tv-dl360-3=10.12.193.75:6789/0}
            election epoch 72, quorum 0,1,2 tv-dl360-1,tv-dl360-2,tv-dl360-3
     osdmap e60609: 72 osds: 72 up, 72 in
      pgmap v3670252: 4864 pgs, 11 pools, 134 GB data, 23778 objects
            490 GB used, 130 TB / 130 TB avail
                4863 active+clean
                   1 incomplete
  client io 0 B/s rd, 38465 B/s wr, 2 op/s

ceph pg repair doesn't change anything. What should I try to recover it?
Attached is the result of ceph pg query on the problem PG.
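For reference, what I ran was along these lines:

# ceph pg repair 3.367
# ceph pg 3.367 query > pg-3.367-query.txt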

Thank you,
Laszlo

============================================== OSD 2 =======================================================

/var/lib/ceph/osd/ceph-2/current/3.367_head$ find . -type f -exec md5sum {} \;
d41d8cd98f00b204e9800998ecf8427e  ./__head_00000367__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./rbd\\udata.69c9916fa90431.0000000000001001__head_CC598367__3
\2a0e6600c78bafd5adcfdb3c406c74fd  ./rbd\\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./rbd\\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
\40192c0d629399d48c4ea150d5cdfefe  ./rbd\\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
\b75539f7512afd713f28279006f6af34  ./rbd\\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./rbd\\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
/var/lib/ceph/osd/ceph-2/current/3.367_head$

/var/lib/ceph/osd/ceph-2/current/3.367_head$ ls -R
.:
__head_00000367__3                                           rbd\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
rbd\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3  rbd\udata.69c9916fa90431.0000000000001001__head_CC598367__3
rbd\udata.674340385edcf2.0000000000000608__head_ABD14B67__3  rbd\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
rbd\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
/var/lib/ceph/osd/ceph-2/current/3.367_head$


============================================== OSD 28 =======================================================

/var/lib/ceph/osd/ceph-28/current/3.367_head$ find . -type f -exec md5sum {} \;
d41d8cd98f00b204e9800998ecf8427e  ./DIR_7/DIR_6/DIR_3/__head_00000367__3
\2a0e6600c78bafd5adcfdb3c406c74fd  ./DIR_7/DIR_6/DIR_3/rbd\\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./DIR_7/DIR_6/DIR_3/rbd\\udata.69c9916fa90431.0000000000001001__head_CC598367__3
\40192c0d629399d48c4ea150d5cdfefe  ./DIR_7/DIR_6/DIR_B/rbd\\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
\b75539f7512afd713f28279006f6af34  ./DIR_7/DIR_6/DIR_B/rbd\\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./DIR_7/DIR_6/DIR_B/rbd\\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./DIR_7/DIR_6/DIR_B/rbd\\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
/var/lib/ceph/osd/ceph-28/current/3.367_head$

/var/lib/ceph/osd/ceph-28/current/3.367_head$ ls -R
.:
DIR_7

./DIR_7:
DIR_6

./DIR_7/DIR_6:
DIR_3  DIR_B

./DIR_7/DIR_6/DIR_3:
__head_00000367__3  rbd\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3  rbd\udata.69c9916fa90431.0000000000001001__head_CC598367__3

./DIR_7/DIR_6/DIR_B:
rbd\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3  rbd\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
rbd\udata.674340385edcf2.0000000000000608__head_ABD14B67__3  rbd\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
/var/lib/ceph/osd/ceph-28/current/3.367_head$


============================================== OSD 35 =======================================================

/var/lib/ceph/osd/ceph-35/current/3.367_head$ find . -type f -exec md5sum {} \;
d41d8cd98f00b204e9800998ecf8427e  ./DIR_7/DIR_6/DIR_3/__head_00000367__3
\2a0e6600c78bafd5adcfdb3c406c74fd  ./DIR_7/DIR_6/DIR_3/rbd\\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./DIR_7/DIR_6/DIR_3/rbd\\udata.69c9916fa90431.0000000000001001__head_CC598367__3
\40192c0d629399d48c4ea150d5cdfefe  ./DIR_7/DIR_6/DIR_B/rbd\\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
\b75539f7512afd713f28279006f6af34  ./DIR_7/DIR_6/DIR_B/rbd\\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./DIR_7/DIR_6/DIR_B/rbd\\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./DIR_7/DIR_6/DIR_B/rbd\\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
/var/lib/ceph/osd/ceph-35/current/3.367_head$

/var/lib/ceph/osd/ceph-35/current/3.367_head$ ls -R
.:
DIR_7

./DIR_7:
DIR_6

./DIR_7/DIR_6:
DIR_3  DIR_B

./DIR_7/DIR_6/DIR_3:
__head_00000367__3  rbd\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3  rbd\udata.69c9916fa90431.0000000000001001__head_CC598367__3

./DIR_7/DIR_6/DIR_B:
rbd\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3  rbd\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
rbd\udata.674340385edcf2.0000000000000608__head_ABD14B67__3  rbd\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
/var/lib/ceph/osd/ceph-35/current/3.367_head$

============================================== OSD 63 =======================================================
/var/lib/ceph/osd/ceph-63/current/3.367_head$ find . -type f -exec md5sum {} \;
d41d8cd98f00b204e9800998ecf8427e  ./__head_00000367__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./rbd\\udata.69c9916fa90431.0000000000001001__head_CC598367__3
\2a0e6600c78bafd5adcfdb3c406c74fd  ./rbd\\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./rbd\\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
\40192c0d629399d48c4ea150d5cdfefe  ./rbd\\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
\b75539f7512afd713f28279006f6af34  ./rbd\\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
\b5cfa9d6c8febd618f91ac2843d50a1c  ./rbd\\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
/var/lib/ceph/osd/ceph-63/current/3.367_head$

/var/lib/ceph/osd/ceph-63/current/3.367_head$ ls -R
.:
__head_00000367__3
rbd\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
rbd\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
rbd\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
rbd\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
rbd\udata.69c9916fa90431.0000000000001001__head_CC598367__3
rbd\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
/var/lib/ceph/osd/ceph-63/current/3.367_head$

============================================== OSD 68 =======================================================

/var/lib/ceph/osd/ceph-68/current/3.367_head$ ls -R
.:
__head_00000367__3
/var/lib/ceph/osd/ceph-68/current/3.367_head$


============================================== OSD 14 =======================================================

/var/lib/ceph/osd/ceph-14/current/3.367_head$ ls -R
.:
__head_00000367__3
/var/lib/ceph/osd/ceph-14/current/3.367_head$

============================================== OSD 17 =======================================================
$ ls /var/lib/ceph/osd/ceph-17/current/3.367*
ls: cannot access /var/lib/ceph/osd/ceph-17/current/3.367*: No such file or directory
$


============================================== OSD 51 =======================================================
$ ls /var/lib/ceph/osd/ceph-51/current/3.367*
ls: cannot access /var/lib/ceph/osd/ceph-51/current/3.367*: No such file or directory
$

============================================== OSD 58 =======================================================
$ ls /var/lib/ceph/osd/ceph-58/current/3.367*
ls: cannot access /var/lib/ceph/osd/ceph-58/current/3.367*: No such file or directory
$

============================================== OSD 64 =======================================================
$ ls /var/lib/ceph/osd/ceph-64/current/3.367*
ls: cannot access /var/lib/ceph/osd/ceph-64/current/3.367*: No such file or directory
$ 

============================================== OSD 70 =======================================================
$ ls /var/lib/ceph/osd/ceph-70/current/3.367*
ls: cannot access /var/lib/ceph/osd/ceph-70/current/3.367*: No such file or directory
$

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
