You may also be able to use an upmap (or the upmap balancer) to help make
room for you on the OSD which is too full.

Respectfully,

*Wes Dillingham*
wes@xxxxxxxxxxxxxxxxx
LinkedIn <http://www.linkedin.com/in/wesleydillingham>


On Fri, Nov 19, 2021 at 1:14 PM Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx>
wrote:

> Okay, now I see your attachment; the PG is in state:
>
> "state":
> "active+undersized+degraded+remapped+inconsistent+backfill_toofull",
>
> The reason it can't scrub or repair is that it's degraded, and further it
> seems that the cluster doesn't have the space to make that recovery
> happen (the "backfill_toofull" state). This may clear on its own as other
> PGs recover and this PG is ultimately able to recover. Other options are
> to remove data or add capacity. How full is your cluster? Is your cluster
> currently backfilling actively?
>
> Respectfully,
>
> *Wes Dillingham*
> wes@xxxxxxxxxxxxxxxxx
> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
>
>
> On Fri, Nov 19, 2021 at 10:57 AM J-P Methot <jp.methot@xxxxxxxxxxxxxxxxx>
> wrote:
>
>> We stopped deep-scrubbing a while ago. However, forcing a deep scrub by
>> doing "ceph pg deep-scrub 6.180" doesn't do anything. The deep scrub
>> doesn't run at all. Could the deep-scrubbing process be stuck elsewhere?
>>
>> On 11/18/21 3:29 PM, Wesley Dillingham wrote:
>>
>> That response is typically indicative of a PG whose OSD set has changed
>> since it was last scrubbed (typically from a disk failing).
>>
>> Are you sure it's actually getting scrubbed when you issue the scrub?
>> For example, you can issue "ceph pg <pg_id> query" and look for
>> "last_deep_scrub_stamp", which will tell you when it was last deep
>> scrubbed.
>>
>> Further, in sufficiently recent versions of Ceph (introduced in
>> 14.2.something iirc), setting the flag "nodeep-scrub" will cause all
>> in-flight deep scrubs to stop immediately. You may have a scheduling
>> issue where your deep scrubs or repairs aren't getting scheduled.
>>
>> Set the nodeep-scrub flag ("ceph osd set nodeep-scrub"), wait for all
>> current deep scrubs to complete, then try to manually re-issue the deep
>> scrub ("ceph pg deep-scrub <pg_id>"). At this point your scrub should
>> start almost immediately, and "rados list-inconsistent-obj 6.180
>> --format=json-pretty" should return something of value.
>>
>> Respectfully,
>>
>> *Wes Dillingham*
>> wes@xxxxxxxxxxxxxxxxx
>> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
>>
>>
>> On Thu, Nov 18, 2021 at 2:38 PM J-P Methot <jp.methot@xxxxxxxxxxxxxxxxx>
>> wrote:
>>
>>> Hi,
>>>
>>> We currently have a PG stuck in an inconsistent state on an erasure
>>> coded pool. The pool's K and M values are 33 and 3. The command "rados
>>> list-inconsistent-obj 6.180 --format=json-pretty" results in the
>>> following error:
>>>
>>> No scrub information available for pg 6.180 error 2: (2) No such file
>>> or directory
>>>
>>> Forcing a deep scrub of the PG does not fix this. Doing a "ceph pg
>>> repair 6.180" doesn't seem to do anything. Is there a known bug
>>> explaining this behavior? I am attaching information regarding the PG
>>> in question.
>>>
>>> --
>>> Jean-Philippe Méthot
>>> Senior Openstack system administrator
>>> Administrateur système Openstack sénior
>>> PlanetHoster inc.
>> --
>> Jean-Philippe Méthot
>> Senior Openstack system administrator
>> Administrateur système Openstack sénior
>> PlanetHoster inc.
>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
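
Pulling the commands discussed above together, here is a minimal sketch of
the sequence, assuming a Luminous-or-newer cluster with the mgr balancer
module enabled. PG 6.180 is the one from this thread; the OSD ids in the
manual upmap line are placeholders you would read out of "ceph pg 6.180
query". Treat this as illustrative rather than a recipe:

    # 1. See how full the cluster and the individual OSDs actually are
    ceph df
    ceph osd df tree
    ceph health detail

    # 2. Make room on the too-full OSD, either via the upmap balancer...
    ceph osd set-require-min-compat-client luminous  # upmap needs luminous+ clients
    ceph balancer mode upmap
    ceph balancer on
    ceph balancer status

    # ...or by moving this one PG by hand (placeholder OSD ids)
    ceph osd pg-upmap-items 6.180 <from-osd> <to-osd>

    # 3. Once the PG is no longer degraded/backfill_toofull, redo the scrub
    ceph osd set nodeep-scrub            # stops in-flight deep scrubs (14.2+)
    # ...wait for any running deep scrubs to finish, then:
    ceph pg deep-scrub 6.180
    ceph pg 6.180 query | grep last_deep_scrub_stamp
    rados list-inconsistent-obj 6.180 --format=json-pretty
    ceph osd unset nodeep-scrub          # re-enable deep scrubbing afterwards

The manual pg-upmap-items line only helps if the target OSD has free space
and is a valid placement for the pool's EC rule; if in doubt, letting the
balancer pick the moves is the safer route.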