Hello,

Thank you for your answer. Indeed, the min_size is 1:

# ceph osd pool get volumes size
size: 3
# ceph osd pool get volumes min_size
min_size: 1
#

I am going to try to find the mentioned discussions on the mailing lists and read them. If you have a link at hand, it would be nice if you could send it to me.

In the attached file you can see the contents of the directory containing this PG's data on the different OSDs (all that have appeared in the pg query). According to the md5sums the files are identical. What bothers me is the directory structure (you can see the ls -R output in each directory that contains files). Where can I read about how/why those DIR_# subdirectories have appeared?

Given that the files themselves are identical on the "current" OSDs belonging to the PG, and that osd.63 (which currently does not belong to the PG) has the same files, is it safe to stop osd.2, remove the 3.367_head directory, and then restart the OSD? (All of this with the noout flag set, of course.)

Kind regards,
Laszlo
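For the step described above (stop osd.2, remove its 3.367_head directory, restart), taking an export first keeps the operation reversible. A minimal sketch with ceph-objectstore-tool, assuming a FileStore OSD laid out as in the attachment; the journal path, the service commands and the export file name are illustrative, and the tool must only be run while the OSD is stopped:

    # Keep CRUSH from rebalancing while the OSD is down.
    ceph osd set noout
    systemctl stop ceph-osd@2          # on Upstart systems: stop ceph-osd id=2

    # Export the whole PG copy from this OSD as a backup.
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 \
        --journal-path /var/lib/ceph/osd/ceph-2/journal \
        --pgid 3.367 --op export --file /root/pg3.367-osd2.export

    # Only once the export is verified: remove the PG copy from this OSD.
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 \
        --journal-path /var/lib/ceph/osd/ceph-2/journal \
        --pgid 3.367 --op remove

    systemctl start ceph-osd@2         # or: start ceph-osd id=2
    ceph osd unset noout

And once the PG is healthy again, raising the pool's min_size as recommended below is a single command:

    ceph osd pool set volumes min_size 2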
On 11.03.2017 00:32, Brad Hubbard wrote:

So this is why it happened, I guess.

pool 3 'volumes' replicated size 3 min_size 1

min_size = 1 is a recipe for disasters like this, and there are plenty of ML threads about not setting it below 2. The past intervals in the pg query show several intervals where a single OSD may have gone rw.

How important is this data?

I would suggest checking which of these OSDs actually have the data for this pg. From the pg query it looks like 2, 35 and 68, and possibly 28 since it's the primary. Check all OSDs in the pg query output. I would then back up all copies, work out which copy, if any, you want to keep, and then attempt something like the following.

https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg17820.html

If you want to abandon the pg, see
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/012778.html
for a possible solution.

http://ceph.com/community/incomplete-pgs-oh-my/ may also give some ideas.

On Fri, Mar 10, 2017 at 9:44 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:

The OSDs are all there.

$ sudo ceph osd stat
     osdmap e60609: 72 osds: 72 up, 72 in

I have attached the results of the ceph osd tree and ceph osd dump commands.

I got some extra info about the network problem. A faulty network device flooded the network, eating up all the bandwidth, so the OSDs were not able to communicate with each other properly. This lasted for almost a day.

Thank you,
Laszlo

On 10.03.2017 12:19, Brad Hubbard wrote:

To me it looks like someone may have done an "rm" on these OSDs but not removed them from the crushmap. This does not happen automatically.

Do these OSDs show up in "ceph osd tree" and "ceph osd dump"? If so, paste the output.

Without knowing what exactly happened here it may be difficult to work out how to proceed. In order to go clean, the primary needs to communicate with multiple OSDs, some of which are marked DNE and seem to be uncontactable. This seems to be more than a network issue (unless the outage is still happening).

http://docs.ceph.com/docs/master/rados/operations/pg-states/?highlight=incomplete
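One quick way to answer the "do these OSDs show up in ceph osd tree and ceph osd dump" question from the shell, using the OSD ids that appear as blocked_by in the query quoted further down (a sketch; plain grep over the standard output of those commands):

    # Do the blocked_by OSDs still appear in the CRUSH tree and the osdmap?
    ceph osd tree | egrep 'osd\.(14|17|51|58|63|64|68|70)($|[^0-9])'
    ceph osd dump | egrep '^osd\.(14|17|51|58|63|64|68|70)($|[^0-9])'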
On Fri, Mar 10, 2017 at 6:09 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:

Hello,

I was informed that due to a networking issue the ceph cluster network was affected. There was huge packet loss, and network interfaces were flapping. That's all I got. This outage lasted for a longer period of time, so I assume that some OSDs may have been considered dead and the data from them moved away to other OSDs (this is what ceph is supposed to do, if I'm correct). Probably that was the point when the listed PGs appeared in the picture.

From the query we can see this for one of those OSDs:

{
    "peer": "14",
    "pgid": "3.367",
    "last_update": "0'0",
    "last_complete": "0'0",
    "log_tail": "0'0",
    "last_user_version": 0,
    "last_backfill": "MAX",
    "purged_snaps": "[]",
    "history": {
        "epoch_created": 4,
        "last_epoch_started": 54899,
        "last_epoch_clean": 55143,
        "last_epoch_split": 0,
        "same_up_since": 60603,
        "same_interval_since": 60603,
        "same_primary_since": 60593,
        "last_scrub": "2852'33528",
        "last_scrub_stamp": "2017-02-26 02:36:55.210150",
        "last_deep_scrub": "2852'16480",
        "last_deep_scrub_stamp": "2017-02-21 00:14:08.866448",
        "last_clean_scrub_stamp": "2017-02-26 02:36:55.210150"
    },
    "stats": {
        "version": "0'0",
        "reported_seq": "14",
        "reported_epoch": "59779",
        "state": "down+peering",
        "last_fresh": "2017-02-27 16:30:16.230519",
        "last_change": "2017-02-27 16:30:15.267995",
        "last_active": "0.000000",
        "last_peered": "0.000000",
        "last_clean": "0.000000",
        "last_became_active": "0.000000",
        "last_became_peered": "0.000000",
        "last_unstale": "2017-02-27 16:30:16.230519",
        "last_undegraded": "2017-02-27 16:30:16.230519",
        "last_fullsized": "2017-02-27 16:30:16.230519",
        "mapping_epoch": 60601,
        "log_start": "0'0",
        "ondisk_log_start": "0'0",
        "created": 4,
        "last_epoch_clean": 55143,
        "parent": "0.0",
        "parent_split_bits": 0,
        "last_scrub": "2852'33528",
        "last_scrub_stamp": "2017-02-26 02:36:55.210150",
        "last_deep_scrub": "2852'16480",
        "last_deep_scrub_stamp": "2017-02-21 00:14:08.866448",
        "last_clean_scrub_stamp": "2017-02-26 02:36:55.210150",
        "log_size": 0,
        "ondisk_log_size": 0,
        "stats_invalid": "0",
        "stat_sum": {
            "num_bytes": 0,
            "num_objects": 0,
            "num_object_clones": 0,
            "num_object_copies": 0,
            "num_objects_missing_on_primary": 0,
            "num_objects_degraded": 0,
            "num_objects_misplaced": 0,
            "num_objects_unfound": 0,
            "num_objects_dirty": 0,
            "num_whiteouts": 0,
            "num_read": 0,
            "num_read_kb": 0,
            "num_write": 0,
            "num_write_kb": 0,
            "num_scrub_errors": 0,
            "num_shallow_scrub_errors": 0,
            "num_deep_scrub_errors": 0,
            "num_objects_recovered": 0,
            "num_bytes_recovered": 0,
            "num_keys_recovered": 0,
            "num_objects_omap": 0,
            "num_objects_hit_set_archive": 0,
            "num_bytes_hit_set_archive": 0
        },
        "up": [28, 35, 2],
        "acting": [28, 35, 2],
        "blocked_by": [],
        "up_primary": 28,
        "acting_primary": 28
    },
    "empty": 1,
    "dne": 0,
    "incomplete": 0,
    "last_epoch_started": 0,
    "hit_set_history": {
        "current_last_update": "0'0",
        "current_last_stamp": "0.000000",
        "current_info": {
            "begin": "0.000000",
            "end": "0.000000",
            "version": "0'0",
            "using_gmt": "1"
        },
        "history": []
    }
},

Where can I read more about the meaning of each parameter? Some of them have quite self-explanatory names, but not all (or probably we need a deeper knowledge to understand them). Isn't there any parameter that would say when that OSD was assigned to the given PG? Also, the stat_sum shows 0 for all of its parameters. Why is it blocking then? Is there a way to tell the PG to forget about that OSD?

Thank you,
Laszlo

On 10.03.2017 03:05, Brad Hubbard wrote:

Can you explain more about what happened?

The query shows progress is blocked by the following OSDs.

"blocked_by": [
    14,
    17,
    51,
    58,
    63,
    64,
    68,
    70
],

Some of these OSDs are marked as "dne" (Does Not Exist).

"peer": "17",
"dne": 1,
"peer": "51",
"dne": 1,
"peer": "58",
"dne": 1,
"peer": "64",
"dne": 1,
"peer": "70",
"dne": 1,

Can we get a complete background here please?
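For pulling exactly these fields back out of the query, something like the following should work, assuming jq is available and that the peers sit under peer_info as they usually do in pg query output (the field names are the ones visible in the query quoted above):

    # Per-peer summary: id plus the dne/empty flags and last_epoch_started.
    ceph pg 3.367 query | jq '.peer_info[] | {peer, dne, empty, last_epoch_started}'

    # Any non-empty blocked_by lists, wherever they appear in the output.
    ceph pg 3.367 query | jq '.. | .blocked_by? // empty | select(length > 0)'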
On Thu, Mar 9, 2017 at 10:53 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:

Hello,

After a major network outage our ceph cluster ended up with an inactive PG:

# ceph health detail
HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow requests
pg 3.367 is stuck inactive for 912263.766607, current state incomplete, last acting [28,35,2]
pg 3.367 is stuck unclean for 912263.766688, current state incomplete, last acting [28,35,2]
pg 3.367 is incomplete, acting [28,35,2]
1 ops are blocked > 268435 sec
1 ops are blocked > 268435 sec on osd.28
1 osds have slow requests

# ceph -s
    cluster 6713d1b8-83da-11e6-aa79-525400d98c5a
     health HEALTH_WARN
            1 pgs incomplete
            1 pgs stuck inactive
            1 pgs stuck unclean
            1 requests are blocked > 32 sec
     monmap e3: 3 mons at {tv-dl360-1=10.12.193.73:6789/0,tv-dl360-2=10.12.193.74:6789/0,tv-dl360-3=10.12.193.75:6789/0}
            election epoch 72, quorum 0,1,2 tv-dl360-1,tv-dl360-2,tv-dl360-3
     osdmap e60609: 72 osds: 72 up, 72 in
      pgmap v3670252: 4864 pgs, 11 pools, 134 GB data, 23778 objects
            490 GB used, 130 TB / 130 TB avail
                4863 active+clean
                   1 incomplete
  client io 0 B/s rd, 38465 B/s wr, 2 op/s

ceph pg repair doesn't change anything.

What should I try to recover it?

Attached is the result of ceph pg query on the problem PG.

Thank you,
Laszlo

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
============================================== OSD 2 =======================================================

/var/lib/ceph/osd/ceph-2/current/3.367_head$ find . -type f -exec md5sum {} \;
d41d8cd98f00b204e9800998ecf8427e ./__head_00000367__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./rbd\\udata.69c9916fa90431.0000000000001001__head_CC598367__3
\2a0e6600c78bafd5adcfdb3c406c74fd ./rbd\\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./rbd\\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
\40192c0d629399d48c4ea150d5cdfefe ./rbd\\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
\b75539f7512afd713f28279006f6af34 ./rbd\\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./rbd\\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
/var/lib/ceph/osd/ceph-2/current/3.367_head$
/var/lib/ceph/osd/ceph-2/current/3.367_head$ ls -R
.:
__head_00000367__3
rbd\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
rbd\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
rbd\udata.69c9916fa90431.0000000000001001__head_CC598367__3
rbd\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
rbd\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
rbd\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
/var/lib/ceph/osd/ceph-2/current/3.367_head$

============================================== OSD 28 =======================================================

/var/lib/ceph/osd/ceph-28/current/3.367_head$ find . -type f -exec md5sum {} \;
d41d8cd98f00b204e9800998ecf8427e ./DIR_7/DIR_6/DIR_3/__head_00000367__3
\2a0e6600c78bafd5adcfdb3c406c74fd ./DIR_7/DIR_6/DIR_3/rbd\\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./DIR_7/DIR_6/DIR_3/rbd\\udata.69c9916fa90431.0000000000001001__head_CC598367__3
\40192c0d629399d48c4ea150d5cdfefe ./DIR_7/DIR_6/DIR_B/rbd\\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
\b75539f7512afd713f28279006f6af34 ./DIR_7/DIR_6/DIR_B/rbd\\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./DIR_7/DIR_6/DIR_B/rbd\\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./DIR_7/DIR_6/DIR_B/rbd\\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
/var/lib/ceph/osd/ceph-28/current/3.367_head$
/var/lib/ceph/osd/ceph-28/current/3.367_head$ ls -R
.:
DIR_7

./DIR_7:
DIR_6

./DIR_7/DIR_6:
DIR_3  DIR_B

./DIR_7/DIR_6/DIR_3:
__head_00000367__3
rbd\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
rbd\udata.69c9916fa90431.0000000000001001__head_CC598367__3

./DIR_7/DIR_6/DIR_B:
rbd\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
rbd\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
rbd\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
rbd\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
/var/lib/ceph/osd/ceph-28/current/3.367_head$
============================================== OSD 35 =======================================================

/var/lib/ceph/osd/ceph-35/current/3.367_head$ find . -type f -exec md5sum {} \;
d41d8cd98f00b204e9800998ecf8427e ./DIR_7/DIR_6/DIR_3/__head_00000367__3
\2a0e6600c78bafd5adcfdb3c406c74fd ./DIR_7/DIR_6/DIR_3/rbd\\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./DIR_7/DIR_6/DIR_3/rbd\\udata.69c9916fa90431.0000000000001001__head_CC598367__3
\40192c0d629399d48c4ea150d5cdfefe ./DIR_7/DIR_6/DIR_B/rbd\\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
\b75539f7512afd713f28279006f6af34 ./DIR_7/DIR_6/DIR_B/rbd\\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./DIR_7/DIR_6/DIR_B/rbd\\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./DIR_7/DIR_6/DIR_B/rbd\\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
/var/lib/ceph/osd/ceph-35/current/3.367_head$
/var/lib/ceph/osd/ceph-35/current/3.367_head$ ls -R
.:
DIR_7

./DIR_7:
DIR_6

./DIR_7/DIR_6:
DIR_3  DIR_B

./DIR_7/DIR_6/DIR_3:
__head_00000367__3
rbd\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
rbd\udata.69c9916fa90431.0000000000001001__head_CC598367__3

./DIR_7/DIR_6/DIR_B:
rbd\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
rbd\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
rbd\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
rbd\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
/var/lib/ceph/osd/ceph-35/current/3.367_head$

============================================== OSD 63 =======================================================

/var/lib/ceph/osd/ceph-63/current/3.367_head$ find . -type f -exec md5sum {} \;
d41d8cd98f00b204e9800998ecf8427e ./__head_00000367__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./rbd\\udata.69c9916fa90431.0000000000001001__head_CC598367__3
\2a0e6600c78bafd5adcfdb3c406c74fd ./rbd\\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./rbd\\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
\40192c0d629399d48c4ea150d5cdfefe ./rbd\\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
\b75539f7512afd713f28279006f6af34 ./rbd\\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
\b5cfa9d6c8febd618f91ac2843d50a1c ./rbd\\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
/var/lib/ceph/osd/ceph-63/current/3.367_head$
/var/lib/ceph/osd/ceph-63/current/3.367_head$ ls -R
.:
__head_00000367__3
rbd\udata.47267a7a3aa9f0.00000000000004f6__head_2E024B67__3
rbd\udata.674340385edcf2.0000000000000608__head_ABD14B67__3
rbd\udata.69c9701a2dc408.0000000000001a01__head_97F0AB67__3
rbd\udata.69c98b6ce01687.0000000000000009__head_42FFF367__3
rbd\udata.69c9916fa90431.0000000000001001__head_CC598367__3
rbd\udata.69c9ac9348fd.0000000000000a05__head_032B0B67__3
/var/lib/ceph/osd/ceph-63/current/3.367_head$

============================================== OSD 68 =======================================================

/var/lib/ceph/osd/ceph-68/current/3.367_head$ ls -R
.:
__head_00000367__3
/var/lib/ceph/osd/ceph-68/current/3.367_head$

============================================== OSD 14 =======================================================

/var/lib/ceph/osd/ceph-14/current/3.367_head$ ls -R
.:
__head_00000367__3
/var/lib/ceph/osd/ceph-14/current/3.367_head$

============================================== OSD 17 =======================================================

$ ls /var/lib/ceph/osd/ceph-17/current/3.367*
ls: cannot access /var/lib/ceph/osd/ceph-17/current/3.367*: No such file or directory
$
============================================== OSD 51 =======================================================

$ ls /var/lib/ceph/osd/ceph-51/current/3.367*
ls: cannot access /var/lib/ceph/osd/ceph-51/current/3.367*: No such file or directory
$

============================================== OSD 58 =======================================================

$ ls /var/lib/ceph/osd/ceph-58/current/3.367*
ls: cannot access /var/lib/ceph/osd/ceph-58/current/3.367*: No such file or directory
$

============================================== OSD 64 =======================================================

$ ls /var/lib/ceph/osd/ceph-64/current/3.367*
ls: cannot access /var/lib/ceph/osd/ceph-64/current/3.367*: No such file or directory
$

============================================== OSD 70 =======================================================

$ ls /var/lib/ceph/osd/ceph-70/current/3.367*
ls: cannot access /var/lib/ceph/osd/ceph-70/current/3.367*: No such file or directory
$
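About the DIR_# question at the top of the thread: the flat layout on osd.2/osd.63 versus the nested DIR_7/DIR_6/... layout on osd.28/osd.35 comes from FileStore's hash-based directory splitting. When a PG's head directory grows past a threshold (driven by the filestore_merge_threshold and filestore_split_multiple settings), its contents are split into DIR_<hex digit> subdirectories keyed on the object's hash read from its least significant hex digit upward, so the same objects can legitimately sit flat on one OSD and nested on another depending on that copy's history; the object data itself is unaffected. A small illustrative helper, consistent with the listings above (not a Ceph tool, just a sketch of the mapping):

    #!/bin/sh
    # hash_to_dir HASH DEPTH: print the nested DIR_ path that hash-based
    # splitting would use, taking DEPTH hex digits from the right of HASH.
    hash_to_dir() {
        hash=$1; depth=$2; path=""
        rev=$(echo "$hash" | rev)      # least significant hex digit first
        i=1
        while [ "$i" -le "$depth" ]; do
            nibble=$(echo "$rev" | cut -c"$i" | tr 'a-f' 'A-F')
            path="$path/DIR_$nibble"
            i=$((i + 1))
        done
        echo "$path"
    }

    hash_to_dir 42FFF367 3    # -> /DIR_7/DIR_6/DIR_3, as on osd.28/osd.35
    hash_to_dir 2E024B67 3    # -> /DIR_7/DIR_6/DIR_B, as on osd.28/osd.35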
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com