Re: Reduced data availability: 4 pgs inactive, 4 pgs incomplete

Unfortunately, this cluster was set up before the calculator was in place and when the equation was not well understood.  We have the storage space to move the pools and recreate them, which was apparently the only way to handle the issue (you are suggesting what appears to be a different approach).  I was hoping to avoid all of that because the migration would be very time consuming.  Is there no way to fix the stuck PGs, though?  If I were to expand the replication to 3 copies, would that help with the PGs-per-OSD issue at all?  The math was originally based on 3, not the current 2.  It sounds like the max may change to 300, which may not be helpful…
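
(For reference, my understanding of the back-of-the-envelope math, with placeholder pool names and made-up numbers, is roughly:

    # PGs per OSD ~= sum over all pools of (pg_num * replica size) / number of OSDs
    # e.g. 10 pools x 1024 PGs x 2 replicas / 8 OSDs ~= 2560 PGs per OSD (illustrative only)
    ceph osd pool ls                  # list the pools
    ceph osd pool get rbd pg_num      # pg_num for a given pool ("rbd" is just a placeholder name)
    ceph osd pool get rbd size        # replica count for that pool
    ceph osd df                       # the PGS column shows the actual per-OSD count

so if I follow the math correctly, going from 2 to 3 replicas would raise, not lower, that figure.)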

 

When you say enforce, do you mean it will block all access to the cluster/OSDs?

 

-Brent

 

From: Janne Johansson [mailto:icepic.dz@xxxxxxxxx]
Sent: Friday, January 5, 2018 3:34 AM
To: Brent Kennedy <bkennedy@xxxxxxxxxx>; ceph-users <ceph-users@xxxxxxxx>
Subject: Re: Reduced data availability: 4 pgs inactive, 4 pgs incomplete

 

 

 

2018-01-05 6:56 GMT+01:00 Brent Kennedy <bkennedy@xxxxxxxxxx>:

We have upgraded from Hammer to Jewel and then to Luminous 12.2.2 as of today.  During the Hammer-to-Jewel upgrade we lost two host servers and let the cluster rebalance/recover; it ran out of space and stalled.  We then added three new host servers and let the cluster rebalance/recover again.  During that process, at some point, we ended up with 4 PGs that could not be repaired using "ceph pg repair xx.xx".  I tried "ceph pg 11.720 query" and from what I can tell the missing information matches, but the PG is being blocked from being marked clean.  I keep seeing references to ceph-objectstore-tool as an export/restore method, but I cannot find a step-by-step process for our current predicament.  It may also be acceptable for us to simply lose the data if it can't be extracted, so that we can at least return the cluster to a healthy state.  Any thoughts?
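
The rough shape of what I've seen described is below (a sketch only; the OSD ids and paths are placeholders, I have not tested this here, and filestore OSDs may also need --journal-path):

    # On an OSD that still holds a copy of PG 11.720 (stop it first):
    systemctl stop ceph-osd@12
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
        --op export --pgid 11.720 --file /root/pg-11.720.export
    # On the OSD that should receive the data (also stopped):
    systemctl stop ceph-osd@34
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-34 \
        --op import --file /root/pg-11.720.export
    systemctl start ceph-osd@12 ceph-osd@34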

ceph -s output:

 

cluster:

    health: HEALTH_ERR

            Reduced data availability: 4 pgs inactive, 4 pgs incomplete

            Degraded data redundancy: 4 pgs unclean

            4 stuck requests are blocked > 4096 sec

            too many PGs per OSD (2549 > max 200)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 

This is the issue.

A temp workaround will be to bump the hard ratio and perhaps restart the OSDs after (or add a ton of OSDs so the PG/OSD count gets below 200).

In your case, the

    osd max pg per osd hard ratio

needs to go from 2.0 to 26.0 or above, which probably is rather crazy.
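
Roughly something like this (a sketch only, values illustrative):

    # ceph.conf on the OSD hosts:
    [global]
    mon max pg per osd = 200
    osd max pg per osd hard ratio = 26.0

    # then, on each OSD host, restart the OSDs:
    systemctl restart ceph-osd.target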

 

The thing is that Luminous 12.2.2 starts enforcing this, which previous versions didn't (at least not in the same way).

 

Even if it is rather weird to run into this, you should have seen the warning before (even if the limit was > 300 previously), which also means you should perhaps have considered not upgrading while the cluster wasn't HEALTH_OK, given that it was warning about a huge number of PGs before going to 12.2.2.
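
i.e. something like

    ceph health detail            # should be HEALTH_OK before an upgrade
    ceph pg dump_stuck inactive   # stuck PGs are worth resolving first

before starting the next upgrade.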

 

--

May the most significant bit of your life be positive.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


