Re: Incomplete pgs and no data movement ( cluster appears readonly )


 



I adjusted “osd max pg per osd hard ratio” to 50.0 and left “mon max pg per osd” at 5000 just to see if that would allow data movement.  It did: the new pool I created finished creating and its PGs spread out.  I was then able to copy the data from the existing pool into the new pool and delete the old one.
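For reference, a rough sketch of what those two settings look like in ceph.conf (assuming they are set under [global]; the daemons still need a restart, or injectargs, to pick up changes):

[global]
    mon max pg per osd = 5000
    osd max pg per osd hard ratio = 50.0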

 

Used this process for copying the default pools:

ceph osd pool create .users.email.new 16

rados cppool .users.email .users.email.new

ceph osd pool delete .users.email .users.email --yes-i-really-really-mean-it

ceph osd pool rename .users.email.new .users.email

ceph osd pool application enable .users.email rgw
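Side note: before the delete step, it may be worth comparing object counts between the old and new pool to confirm the copy completed; a rough check would be:

rados df | grep .users.email
# or count objects in each pool directly:
rados -p .users.email ls | wc -l
rados -p .users.email.new ls | wc -l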

 

 

So at this point, I have recreated all the .rgw and .user pools except .rgw.buckets with a pg_num of 16, which significantly reduced the total PG count.  Unfortunately, the incomplete PGs are still there:

 

  cluster:
    health: HEALTH_WARN
            Reduced data availability: 4 pgs inactive, 4 pgs incomplete
            Degraded data redundancy: 4 pgs unclean

  services:
    mon: 3 daemons, quorum mon1,mon2,mon3
    mgr: mon3(active), standbys: mon1, mon2
    osd: 43 osds: 43 up, 43 in

  data:
    pools:   10 pools, 4240 pgs
    objects: 8148k objects, 10486 GB
    usage:   21536 GB used, 135 TB / 156 TB avail
    pgs:     0.094% pgs not active
             4236 active+clean
             4    incomplete

 

The health page is showing blue instead of red on the donut chart; at one point it jumped to green, but it’s back to blue currently.  There are no more blocked/delayed ops either.

 

Thanks for the assistance; it seems the cluster will play nice now.  Any thoughts on the stuck PGs?  I ran a query on PG 11.720 and it shows:

"blocked_by": [
    13,
    27,
    28
],
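For reference, that is output from “ceph pg 11.720 query”; something like this pulls out just the blocked_by section:

ceph pg 11.720 query | grep -A 5 blocked_by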

 

OSD 13 was acting strange, so I wiped it and removed it from the cluster.  This was during the rebuild, so I wasn’t aware of it blocking anything.  Now I am trying to figure out how a removed OSD can still be blocking.  I went through the process to remove it:

ceph osd crush remove osd.13

ceph auth del osd.13

ceph osd rm 13
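To double-check that the removed OSD is really gone from the CRUSH map and the OSD map, both of these should come back empty now:

ceph osd tree | grep -w osd.13
ceph osd dump | grep -w osd.13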

 

I guess since the cluster was a hot mess at that point, it’s possible the removal was borked and therefore the PG is borked.  I am trying to avoid deleting the data, as there is data on the OSDs that are still online.

 

-Brent

 

 

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Brent Kennedy
Sent: Wednesday, January 10, 2018 12:20 PM
To: 'Janne Johansson' <icepic.dz@xxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: Incomplete pgs and no data movement ( cluster appears readonly )

 

I changed “mon max pg per osd” to 5000 because when I set it to zero, which was supposed to disable it, it caused an issue where I couldn’t create any pools; it would say 0 was larger than the minimum.  I imagine that’s a bug: if I wanted it disabled, then it shouldn’t use the calculation.  I then set “osd max pg per osd hard ratio” to 5 after changing “mon max pg per osd” to 5000, figuring 5 * 5000 would cover it.  Perhaps not.  I will adjust it to 30 and restart the OSDs.
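Rough math on that, assuming the effective hard limit is “mon max pg per osd” multiplied by the hard ratio:

5 * 5000 = 25000 PGs per OSD before the hard limit kicks in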

 

-Brent

 

 

 

From: Janne Johansson [mailto:icepic.dz@xxxxxxxxx]
Sent: Wednesday, January 10, 2018 3:00 AM
To: Brent Kennedy <bkennedy@xxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: Incomplete pgs and no data movement ( cluster appears readonly )

 

 

 

2018-01-10 8:51 GMT+01:00 Brent Kennedy <bkennedy@xxxxxxxxxx>:

As per a previous thread, my PG counts are set too high.  I tried adjusting “mon max pg per osd” up higher and higher, which did clear the error (I restarted the monitors and managers each time), but it seems that data simply won’t move around the cluster.  If I stop the primary OSD of an incomplete PG, the cluster just shows the affected PGs as active+undersized+degraded:
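In case it helps, a quick way to list exactly which PGs are stuck is roughly:

ceph health detail | grep incomplete
ceph pg dump_stuck inactive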

 

I also adjusted “osd max pg per osd hard ratio” to 5, but that didn’t seem to trigger any data movement.  I did restart the OSDs each time I changed it.  The data just won’t finish moving.  “ceph -w” shows this:

2018-01-10 07:49:27.715163 osd.20 [WRN] slow request 960.675164 seconds old, received at 2018-01-10 07:33:27.039907: osd_op(client.3542508.0:4097 14.0 14.50e8d0b0 (undecoded) ondisk+write+known_if_redirected e125984) currently queued_for_pg
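One way to dig into what that op is actually waiting on is the admin socket on the node hosting osd.20, for example:

ceph daemon osd.20 dump_ops_in_flight
ceph daemon osd.20 dump_historic_ops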

 

 

Did you bump the ratio so that the PGs-per-OSD max * hard ratio actually became more than the number of PGs per OSD you had?

Last time you mailed, the PGs-per-OSD count was 25xx and the max was 200, which meant the hard ratio would have needed to be far more than 5.0.
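A rough worked check of those numbers, with the hard limit being the max multiplied by the hard ratio:

200 * 5.0  = 1000    (still well below ~2500 PGs per OSD)
200 * 13.0 = 2600    (roughly the smallest ratio that would clear 25xx)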

 

 

--

May the most significant bit of your life be positive.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
