Incomplete PGs

Hi all, I have a problem with some incomplete pgs. Here's the backstory: I had a pool that I had accidentally left with a size of 2. On one of the osd nodes the system hdd started to fail, and I attempted to rescue it by sacrificing one of that node's osds. That went ok and I was able to bring the node back up minus the one osd. Now I have 11 incomplete pgs. I believe these are mostly from the pool that only had size 2, but I can't tell for sure. I found another thread on here that talked about using ceph_objectstore_tool to add or remove pg data to get out of an incomplete state.
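
In case it helps, this is roughly how I've been trying to tie the incomplete pgs back to the size-2 pool; the pool name below is a placeholder, and the pool id is just the part of the pg id before the dot (so 0.63b is in pool 0):

    # list the incomplete pgs and their ids
    ceph health detail | grep incomplete
    ceph pg dump_stuck inactive

    # map pool ids to pool names, and confirm which pool is still at size 2
    ceph osd lspools
    ceph osd pool get <poolname> size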

Let's start with the one pg I've been playing with; this is a loose description of where I've been. First, when I queried it I saw that it had the missing osd in "down_osds_we_would_probe", and some reading around that told me to recreate the missing osd, so I did. It (obviously) didn't have the missing data, but it took the pg from down+incomplete to just incomplete. Then I tried force_create_pg and that didn't seem to make a difference. Some more googling brought me to ceph_objectstore_tool, and I started taking a closer look at the results from pg query. I noticed that the list of probing osds gets longer and longer, until the end of the query has something like:

 "probing_osds": [
       "0",
       "3",
       "4",
       "16",
       "23",
       "26",
       "35",
       "41",
       "44",
       "51",
       "56"],

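(For reference, I'm pulling these lists out of the query output with something like the following; jq is just what I happen to have handy, and the field names are as they appear in my query output.)

    # dump the full peering state for the pg
    ceph pg 0.63b query > /tmp/0.63b.json

    # the osd lists peering cares about are in the recovery_state section
    jq '.recovery_state[] | select(has("probing_osds")) | .probing_osds' /tmp/0.63b.json
    jq '.recovery_state[] | select(has("down_osds_we_would_probe")) | .down_osds_we_would_probe' /tmp/0.63b.json
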
So I took a look at those osds and noticed that some of them have data in the directory for the troublesome pg and others don't. I picked the one with the *most* data and used ceph_objectstore_tool to export the pg. It was > 6G, so a fair amount of data is still there. I then imported it (after removing the existing copy) into all the others in that list; the rough commands are sketched after the query output below. Unfortunately, it is still incomplete. I'm not sure what my next step should be here. Here's some other stuff from the query on it:

"info": { "pgid": "0.63b",
    "last_update": "50495'8246",
    "last_complete": "50495'8246",
    "log_tail": "20346'5245",
    "last_user_version": 8246,
    "last_backfill": "MAX",
    "purged_snaps": "[]",
    "history": { "epoch_created": 1,
        "last_epoch_started": 51102,
        "last_epoch_clean": 50495,
        "last_epoch_split": 0,
        "same_up_since": 68312,
        "same_interval_since": 68312,
        "same_primary_since": 68190,
        "last_scrub": "28158'8240",
        "last_scrub_stamp": "2014-11-18 17:08:49.368486",
        "last_deep_scrub": "28158'8240",
        "last_deep_scrub_stamp": "2014-11-18 17:08:49.368486",
        "last_clean_scrub_stamp": "2014-11-18 17:08:49.368486"},
    "stats": { "version": "50495'8246",
        "reported_seq": "84279",
        "reported_epoch": "69394",
        "state": "down+incomplete",
        "last_fresh": "2014-12-01 23:23:07.355308",
        "last_change": "2014-12-01 21:28:52.771807",
        "last_active": "2014-11-24 13:37:09.784417",
        "last_clean": "2014-11-22 21:59:49.821836",
        "last_became_active": "0.000000",
        "last_unstale": "2014-12-01 23:23:07.355308",
        "last_undegraded": "2014-12-01 23:23:07.355308",
        "last_fullsized": "2014-12-01 23:23:07.355308",
        "mapping_epoch": 68285,
        "log_start": "20346'5245",
        "ondisk_log_start": "20346'5245",
        "created": 1,
        "last_epoch_clean": 50495,
        "parent": "0.0",
        "parent_split_bits": 0,
        "last_scrub": "28158'8240",
        "last_scrub_stamp": "2014-11-18 17:08:49.368486",
        "last_deep_scrub": "28158'8240",
        "last_deep_scrub_stamp": "2014-11-18 17:08:49.368486",
        "last_clean_scrub_stamp": "2014-11-18 17:08:49.368486",
        "log_size": 3001,
        "ondisk_log_size": 3001,


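For reference, the export/remove/import sequence I ran looked roughly like this; the osd ids and paths are just examples from my setup, and each osd was stopped before running the tool against its store and started again afterwards:

    # export from the osd that had the most data for the pg
    stop ceph-osd id=35
    ceph_objectstore_tool --data-path /var/lib/ceph/osd/ceph-35 \
        --journal-path /var/lib/ceph/osd/ceph-35/journal \
        --op export --pgid 0.63b --file /tmp/0.63b.export
    start ceph-osd id=35

    # on each of the other osds in the probing list: remove the partial copy,
    # then import the exported one
    stop ceph-osd id=3
    ceph_objectstore_tool --data-path /var/lib/ceph/osd/ceph-3 \
        --journal-path /var/lib/ceph/osd/ceph-3/journal \
        --op remove --pgid 0.63b
    ceph_objectstore_tool --data-path /var/lib/ceph/osd/ceph-3 \
        --journal-path /var/lib/ceph/osd/ceph-3/journal \
        --op import --file /tmp/0.63b.export
    start ceph-osd id=3
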
Also, in the peering section all the peers now have the same last_update, which makes me think the pg should just pick up and take off.
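
(That observation comes from eyeballing the peer_info entries in the same query output; I'm comparing them with something like this, again assuming jq and the field names from my version:)

    # compare last_update across all peers for the pg
    jq '.peer_info[] | {peer: .peer, last_update: .last_update}' /tmp/0.63b.json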

There is another thing I'm having problems with, and I'm not sure if it's related or not. I set a crush map manually, since I have a mix of ssd and platter osds, and it seems to work when I set it: the cluster starts rebalancing, etc. But if I do a restart ceph-all on all my nodes, the crush map seems to revert to the one I didn't set. I don't know if it's being blocked from taking effect by these incomplete pgs, or if I'm missing a step to get it to "stick". It also makes me think that when I'm stopping and starting these osds to use ceph_objectstore_tool on them, they may be getting out of sync with the cluster.
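
For completeness, this is how I'm installing the map, plus the one config option I've seen mentioned that might explain the revert; I haven't confirmed it applies here, so treat it as a guess:

    # pull, decompile, edit, recompile, and push the crush map
    ceph osd getcrushmap -o /tmp/crush.bin
    crushtool -d /tmp/crush.bin -o /tmp/crush.txt
    # ...edit /tmp/crush.txt (ssd vs platter roots and rules)...
    crushtool -c /tmp/crush.txt -o /tmp/crush.new
    ceph osd setcrushmap -i /tmp/crush.new

    # ceph.conf, [osd] section: supposedly keeps osds from re-adding themselves
    # at their default crush location when they start, which could undo a
    # hand-edited map on restart
    # osd crush update on start = false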

Any insights would be greatly appreciated,

Aaron 





