Hi Josh,

Thanks for your reply. I already tried that, with no luck. The primary OSD goes down and hangs forever upon the "mark_unfound_lost delete" command. I guess the PG is too damaged to salvage, unless one really starts deleting individual corrupt objects?

Anyway, as I said, the files in the PG have been identified and are under backup, so I just want the PG healthy again, no matter what ;-)

I actually discovered that removing the PG's shards with ceph-objectstore-tool does indeed get the PG back to active+clean (containing 0 objects, though). One just needs to run a final remove - start/stop OSD - repair - mark-complete on the primary OSD. A scrub tells me that the "active+clean" state is for real.

I also found out that the more automated "force-create-pg" command only works on PGs that are in the down state.

Best,
Jesper

--------------------------
Jesper Lykkegaard Karlsen
Scientific Computing
Centre for Structural Biology
Department of Molecular Biology and Genetics
Aarhus University
Universitetsbyen 81
8000 Aarhus C

E-mail: jelka@xxxxxxxxx
Tlf: +45 50906203

> On 20 Sep 2022, at 15.40, Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote:
>
> Hi Jesper,
>
> Given that the PG is marked recovery_unfound, I think you need to
> follow https://docs.ceph.com/en/quincy/rados/troubleshooting/troubleshooting-pg/#unfound-objects.
>
> Josh
>
> On Tue, Sep 20, 2022 at 12:56 AM Jesper Lykkegaard Karlsen
> <jelka@xxxxxxxxx> wrote:
>>
>> Dear all,
>>
>> System: latest Octopus, 8+3 erasure-coded CephFS
>>
>> I have a PG that has been driving me crazy.
>> It got into a bad state after heavy backfilling, combined with OSDs going down in turn.
>>
>> Its state is:
>>
>> active+recovery_unfound+undersized+degraded+remapped
>>
>> I have tried repairing it with ceph-objectstore-tool, but no luck so far.
>> Given the time recovery takes this way, and since the data are under backup, I thought I would take the "easy" approach instead and:
>>
>> * scan pg_files with cephfs-data-scan
>> * delete the data belonging to that pool
>> * recreate the PG with "ceph osd force-create-pg"
>> * restore the data
>>
>> Although, this has turned out not to be so easy after all.
>>
>> ceph osd force-create-pg 20.13f --yes-i-really-mean-it
>>
>> seems to be accepted well enough with "pg 20.13f now creating, ok", but then nothing happens.
>> Issuing the command again just gives a "pg 20.13f already creating" response.
>>
>> If I restart the primary OSD, the pending force-create-pg disappears.
>>
>> I read that this could be due to a CRUSH map issue, but I have checked, and that does not seem to be the case.
>>
>> Would it, for instance, be possible to do the force-create-pg manually with something like this?:
>>
>> * set nobackfill and norecovery
>> * delete the PG's shards one by one
>> * unset nobackfill and norecovery
>>
>> Any idea on how to proceed from here is most welcome.
>>
>> Thanks,
>> Jesper
>>
>> --------------------------
>> Jesper Lykkegaard Karlsen
>> Scientific Computing
>> Centre for Structural Biology
>> Department of Molecular Biology and Genetics
>> Aarhus University
>> Universitetsbyen 81
>> 8000 Aarhus C
>>
>> E-mail: jelka@xxxxxxxxx
>> Tlf: +45 50906203
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
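
P.S. For the archives, roughly the sequence I ended up running - take it as a sketch only. The OSD IDs, the EC shard suffixes ("s<N>") and the data path are placeholders for my cluster, and every ceph-objectstore-tool step has to be run with that OSD stopped:

  # on each OSD still holding a shard of the broken PG (OSD stopped)
  systemctl stop ceph-osd@<osd-id>
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<osd-id> \
      --pgid 20.13fs<shard> --op remove --force
  systemctl start ceph-osd@<osd-id>

  # final pass on the primary OSD: remove its shard, start/stop the OSD,
  # repair, then mark the (now empty) PG complete
  systemctl stop ceph-osd@<primary-id>
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<primary-id> \
      --pgid 20.13fs<primary-shard> --op remove --force
  systemctl start ceph-osd@<primary-id>
  ceph pg repair 20.13f
  systemctl stop ceph-osd@<primary-id>
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<primary-id> \
      --pgid 20.13fs<primary-shard> --op mark-complete
  systemctl start ceph-osd@<primary-id>

  # a deep scrub afterwards confirmed the PG really is active+clean
  ceph pg deep-scrub 20.13f
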
_______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an email to ceph-users-leave@xxxxxxx