Re: Incomplete pgs and no data movement ( cluster appears readonly )

On Wed, Jan 10, 2018 at 11:14 AM, Brent Kennedy <bkennedy@xxxxxxxxxx> wrote:
> I adjusted “osd max pg per osd hard ratio” to 50.0 and left “mon max pg per
> osd” at 5000 just to see if things would allow data movement.  This worked:
> the new pool I created finished creating and spread out.  I was then able to
> copy the data from the existing pool into the new pool and delete the old
> one.
>
>
>
> Used this process for copying the default pools:
>
> ceph osd pool create .users.email.new 16
> rados cppool .users.email .users.email.new
> ceph osd pool delete .users.email .users.email --yes-i-really-really-mean-it
> ceph osd pool rename .users.email.new .users.email
> ceph osd pool application enable .users.email rgw
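>
> Roughly the same thing as a small shell loop, for the record (the pool list
> below is only illustrative; the real names came from “ceph osd lspools”):
>
> for pool in .users .users.uid .users.email .usage ; do
>     ceph osd pool create ${pool}.new 16
>     rados cppool ${pool} ${pool}.new
>     ceph osd pool delete ${pool} ${pool} --yes-i-really-really-mean-it
>     ceph osd pool rename ${pool}.new ${pool}
>     ceph osd pool application enable ${pool} rgw
> done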
>
>
>
>
>
> So at this point, I have recreated all the .rgw and .user pools except
> .rgw.buckets with a pg_num of 16, which significantly reduced the PG count;
> unfortunately, the incompletes are still there:
>
>
>
>   cluster:
>
>    health: HEALTH_WARN
>
>             Reduced data availability: 4 pgs inactive, 4 pgs incomplete
>
>             Degraded data redundancy: 4 pgs unclean

There seems to have been some confusion here. From your prior thread:

On Thu, Jan 4, 2018 at 9:56 PM, Brent Kennedy <bkennedy@xxxxxxxxxx> wrote:
> We have upgraded from Hammer to Jewel and then Luminous 12.2.2 as of today.
> During the hammer upgrade to Jewel we lost two host servers

So, if you have size two and you lose two servers before the data has
finished recovering... you've lost data. And that is indeed what
"incomplete" means: the PG thinks writes may have happened, but the
OSDs which held the data at that time aren't available. You'll need to
dive into PG recovery with ceph-objectstore-tool and related tooling,
or find one of the groups that does consulting around recovery.
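
As a very rough sketch of the first step, assuming you can still mount a
disk that held a copy of the PG (the osd data path and pgid below are only
examples, and the OSD daemon must be stopped first):

  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-13 \
      --pgid 11.720 --op export --file /root/pg.11.720.export

The exported PG can then be brought into an OSD that should own it with
--op import, but treat this as read-the-docs territory rather than a
recipe.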
-Greg

>
>
>
>   services:
>
>     mon: 3 daemons, quorum mon1,mon2,mon3
>
>     mgr: mon3(active), standbys: mon1, mon2
>
>     osd: 43 osds: 43 up, 43 in
>
>
>
>   data:
>
>     pools:   10 pools, 4240 pgs
>
>     objects: 8148k objects, 10486 GB
>
>     usage:   21536 GB used, 135 TB / 156 TB avail
>
>     pgs:     0.094% pgs not active
>
>              4236 active+clean
>
>              4    incomplete
>
>
>
> The health page is showing blue instead of red on the donut chart; at one
> point it jumped to green, but it's back to blue currently.  There are no more
> ops blocked/delayed either.
>
>
>
> Thanks for the assistance; it seems the cluster will play nice now.  Any
> thoughts on the stuck pgs?  I ran a query on 11.720 and it shows:
>
> "blocked_by": [
>
>                 13,
>
>                 27,
>
>                 28
>
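> (For reference, that came from the standard per-PG query, i.e. something
> along the lines of “ceph pg 11.720 query”.)
>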
>
>
> OSD 13 was acting strange so I wiped it and removed it from the cluster.
> This was during the rebuild so I wasn’t aware of it blocking.  Now I am
> trying to figure out how a removed OSD is blocking.  I went through the
> process to remove it:
>
> ceph osd crush remove osd.13
> ceph auth del osd.13
> ceph osd rm osd.13
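>
> (For what it's worth, my understanding is that the usual sequence also marks
> the OSD out and stops the daemon before those three steps; on a systemd host
> that would look roughly like:
>
> ceph osd out osd.13
> systemctl stop ceph-osd@13
>
> followed by the crush remove / auth del / osd rm above.)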
>
>
>
> I guess since the cluster was a hot mess at that point, it's possible it was
> borked and therefore the pg is borked.  I am trying to avoid deleting the
> data, as there is data in the OSDs that are online.
>
>
>
> -Brent
>
>
>
>
>
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> Brent Kennedy
> Sent: Wednesday, January 10, 2018 12:20 PM
> To: 'Janne Johansson' <icepic.dz@xxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re:  Incomplete pgs and no data movement ( cluster
> appears readonly )
>
>
>
> I changed “mon max pg per osd” to 5000 because when I changed it to zero,
> which was supposed to disable it, it caused an issue where I couldn’t create
> any pools.  It would say 0 was larger than the minimum.  I imagine that’s a
> bug: if I wanted it disabled, then it shouldn’t use the calculation.  I then
> set “osd max pg per osd hard ratio” to 5 after changing “mon max pg per
> osd” to 5000, figuring 5*5000 would cover it.  Perhaps not.  I will adjust
> it to 30 and restart the OSDs.
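>
> (Concretely, what I plan to drop into ceph.conf before restarting; the
> option names below are the 12.2.x ones as I understand them:
>
> [global]
> mon_max_pg_per_osd = 5000
> osd_max_pg_per_osd_hard_ratio = 30
> )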
>
>
>
> -Brent
>
>
>
>
>
>
>
> From: Janne Johansson [mailto:icepic.dz@xxxxxxxxx]
> Sent: Wednesday, January 10, 2018 3:00 AM
> To: Brent Kennedy <bkennedy@xxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re:  Incomplete pgs and no data movement ( cluster
> appears readonly )
>
>
>
>
>
>
>
> 2018-01-10 8:51 GMT+01:00 Brent Kennedy <bkennedy@xxxxxxxxxx>:
>
> As per a previous thread, my pgs are set too high.  I tried adjusting the
> “mon max pg per osd” up higher and higher, which did clear the error
> (I restarted the monitors and managers each time), but it seems that data
> simply won't move around the cluster.  If I stop the primary OSD of an
> incomplete pg, the cluster just shows those affected pgs as
> active+undersized+degraded:
>
>
>
> I also adjusted “osd max pg per osd hard ratio” to 5, but that didn’t seem
> to trigger any data movement.  I did restart the OSDs each time I changed it.
> The data just won't finish moving.  “ceph -w” shows this:
>
> 2018-01-10 07:49:27.715163 osd.20 [WRN] slow request 960.675164 seconds old,
> received at 2018-01-10 07:33:27.039907: osd_op(client.3542508.0:4097 14.0
> 14.50e8d0b0 (undecoded) ondisk+write+known_if_redirected e125984) currently
> queued_for_pg
>
>
>
>
>
> Did you bump the ratio so that the PGs-per-OSD max * hard ratio actually
> became more than the number of PGs you had?
>
> Last time you mailed, the PG count per OSD was 25xx and the max was 200,
> which meant the ratio would have needed to be far more than 5.0.
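>
> I.e. the effective cap is roughly mon max pg per osd * hard ratio:
> 200 * 5.0 = 1000, still well under ~2500 PGs per OSD, and covering 2500
> with a max of 200 would take a ratio above 12.5.  With the max at 5000,
> 5000 * 5.0 = 25000 clears it easily.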
>
>
>
>
>
> --
>
> May the most significant bit of your life be positive.
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



