Re: Reduced data availability: 4 pgs inactive, 4 pgs incomplete

-----Original Message-----
From: Stefan Kooman [mailto:stefan@xxxxxx] 
Sent: Friday, January 5, 2018 2:58 PM
To: Brent Kennedy <bkennedy@xxxxxxxxxx>
Cc: 'Janne Johansson' <icepic.dz@xxxxxxxxx>; 'ceph-users' <ceph-users@xxxxxxxx>
Subject: Re:  Reduced data availability: 4 pgs inactive, 4 pgs incomplete

Quoting Brent Kennedy (bkennedy@xxxxxxxxxx):
> Unfortunately, this cluster was set up before the calculator was in 
> place and when the equation was not well understood.  We have the 
> storage space to move the pools and recreate them, which was 
> apparently the only way to handle the issue (you are suggesting what 
> appears to be a different approach).  I was hoping to avoid doing all 
> of this because the migration would be very time consuming.  There is 
> no way to fix the stuck PGs though?  If I were to expand the 
> replication to 3 copies, would that help with the PGs per OSD issue 
> any?
No! It will make the problem worse, because every replica needs its own PG copy on an OSD. The more replicas, the more PG copies each OSD has to host.
I guess I am confused here: wouldn’t it spread the existing data out over more PGs?  Or are you saying it can’t spread out because the PGs are already in use?  Previously it was set to 3 and we reduced it to 2 because of failures.
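
As I understand it now, the rough rule of thumb is:

    PGs per OSD ≈ sum over all pools of (pg_num x size) / number of OSDs

so going from size 2 to 3 multiplies the numerator by 1.5 without spreading anything out further, which is presumably why it makes the per-OSD count worse rather than better.  If I am reading the Luminous docs right, the actual per-OSD figure shows up in the PGS column of "ceph osd df".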

> The math was originally based on 3, not the current 2.  Sounds like it 
> may change to a 300 max, which may not be helpful…

> When you say enforce, do you mean it will block all access to the cluster/OSDs?

No, you will not be able to increase the number of PGs on the pool.
I had no intention of doing so, but it sounds like you are saying that "enforcing" means it won't allow additional PGs.  That makes sense.
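
For my own notes: if we ever do need headroom there, it looks like the Luminous limit is governed by the mon_max_pg_per_osd option, which should be checkable the same way as the OSD setting further down, e.g.

ceph daemon mon.$ID config show | grep mon_max_pg_per_osd

and raisable in ceph.conf (or, I think, temporarily via "ceph tell mon.\* injectargs '--mon_max_pg_per_osd 300'"), though I have not tried that on 12.2.2 and the default may differ per release.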

> 
> We have upgraded from Hammer to Jewel and then Luminous 12.2.2 as of 
> today.  During the Hammer to Jewel upgrade we lost two host servers 
> and let the cluster rebalance/recover, it ran out of space and 
> stalled.  We then added three new host servers and then let the 
> cluster rebalance/recover. During that process, at some point, we 
> ended up with 4 PGs that could not be repaired using “ceph pg 
> repair xx.xx”.  I tried “ceph pg 11.720 query” and from what I can 
> tell the missing information matches, but the PGs are being blocked 
> from being marked clean.  I keep seeing references to 
> ceph-objectstore-tool as an export/restore method, but I cannot find 
> a step-by-step guide for our current predicament.  It may also be 
> possible for us to just lose the data if it can’t be extracted, so we 
> can at least return the cluster to a healthy state.  Any thoughts?
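
On the ceph-objectstore-tool question: I have not needed it on a production cluster myself, so treat the following only as a rough sketch.  It assumes filestore OSDs, uses 11.720 as the example PG, and requires the OSD to be stopped (and backed up) first; <id> is a placeholder for the OSD id:

# on an OSD that still holds a copy of the PG (OSD stopped)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
    --journal-path /var/lib/ceph/osd/ceph-<id>/journal \
    --pgid 11.720 --op export --file /root/11.720.export

# on the OSD that should receive the PG (also stopped)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
    --journal-path /var/lib/ceph/osd/ceph-<id>/journal \
    --op import --file /root/11.720.export

The man page covers the variations (removing a broken copy with --op remove, bluestore vs. filestore options), so please double-check against the 12.2.2 documentation before running anything.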

What is the output of "ceph daemon osd.$ID config show | grep osd_allow_recovery_below_min_size"?

The output was "true" for all OSDs in the cluster, so we should be in the clear there.

If you are below min_size, recovery will not complete when that setting is not true. This thread may be of interest:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-October/005613.html

Especially when an OSD is a candidate backfill target but does not contain any data.
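
It may also be worth checking min_size on the affected pool; if it is still 2 from when the pool had size 3, recovery with a single surviving copy will be blocked.  Something like (only as a temporary measure, and set it back afterwards):

ceph osd pool get .rgw.buckets min_size
ceph osd pool set .rgw.buckets min_size 1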

Gr. Stefan

So perhaps the only way to deal with this would be creating a new pool and migrating all the data to it?  We are using radosgw, so that complicates the process a bit, as I believe it requires certain pools to be in place.  I was trying to find a list of the default pools that could safely be deleted, mainly because some of those pools are set to a high number of PGs as well; deleting them should help.
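
The rough shape of what I had in mind for the data pool is below; the pool names and pg_num are placeholders, and I am not sure rados cppool copies everything cleanly (snapshots and, I believe, omap data, which would matter for the index pool), so I would test on something unimportant first:

ceph osd pool create .rgw.buckets.new 512      # pick a sane pg_num this time
rados cppool .rgw.buckets .rgw.buckets.new     # slow; see caveats above
ceph osd pool rename .rgw.buckets .rgw.buckets.old
ceph osd pool rename .rgw.buckets.new .rgw.buckets

and keep .rgw.buckets.old around until radosgw has been verified against the new pool.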

Pools List:
Name                 PG status                        Usage (used / total)    Activity
data                 512 active+clean                 0 / 45.5T               0 rd, 0 wr
rbd                  512 active+clean                 0 / 45.5T               0 rd, 0 wr
.rgw.root            4096 active+clean                1.01k / 45.5T           0 rd, 0 wr
.rgw.control         4096 active+clean                0 / 45.5T               0 rd, 0 wr
.rgw                 4096 active+clean                9.48k / 45.5T           0 rd, 0 wr
.rgw.gc              4096 active+clean                0 / 45.5T               0 rd, 0 wr
.users.uid           4096 active+clean                961 / 45.5T             0 rd, 0 wr
.users               4096 active+clean                33 / 45.5T              0 rd, 0 wr
.rgw.buckets.index   4096 active+clean                0 / 45.5T               0 rd, 0 wr
.rgw.buckets         4092 active+clean, 4 incomplete  11.2T / 68.3T           0 rd, 0 wr
.users.email         4096 active+clean                0 / 45.5T               0 rd, 0 wr
.log                 16 active+clean                  0 / 45.5T               0 rd, 0 wr
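
(For whichever of those turn out to be unused, my understanding is that on Luminous pool deletion is disabled unless mon_allow_pool_delete is set, so it would be something like:

ceph tell mon.\* injectargs '--mon_allow_pool_delete=true'
ceph osd pool delete <pool> <pool> --yes-i-really-really-mean-it

but I would want confirmation of which rgw pools are actually safe to drop first.)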


-Brent

-- 
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



