Hi,

I had looked at the output of `ceph health detail`, which told me to search
for 'incomplete' in the docs. Since that said to file a bug (and I was sure
that filing a bug would not help), I continued to purge the disks that we had
overwritten; ceph then did some magic and told me that the PGs were again
available on three OSDs but were incomplete.

I have now gone ahead and marked all three of the OSDs where one of my
incomplete PGs is (according to `ceph pg ls incomplete`) as lost, one by one,
waiting for `ceph status` to settle in between, and that led to the PG now
being incomplete on three different OSDs.

Also, force-create-pg tells me "already created".

On 29.01.2020, Gregory Farnum wrote:
> There should be docs on how to mark an OSD lost, which I would expect to be
> linked from the troubleshooting PGs page.
>
> There is also a command to force create PGs but I don't think that will
> help in this case since you already have at least one copy.
>
> On Tue, Jan 28, 2020 at 5:15 PM Hartwig Hauschild <ml-ceph@xxxxxxxxxxxx>
> wrote:
>
> > Hi.
> >
> > before I descend into what happened and why it happened: I'm talking
> > about a test-cluster so I don't really care about the data in this case.
> >
> > We've recently started upgrading from luminous to nautilus, and for us
> > that means we're retiring ceph-disk in favour of ceph-volume with lvm
> > and dmcrypt.
> >
> > Our setup is in containers and we've got DBs separated from data.
> > When testing our upgrade-path we discovered that running the host on
> > ubuntu-xenial and the containers on centos-7.7 leads to lvm inside the
> > containers not using lvmetad because it's too old. That in turn means
> > that not running `vgscan --cache` on the host before adding an LV to a
> > VG essentially zeros the metadata for all LVs in that VG.
> >
> > That happened on two out of three hosts for a bunch of OSDs and those
> > OSDs are gone. I have no way of getting them back, they've been
> > overwritten multiple times trying to figure out what went wrong.
> >
> > So now I have a cluster that's got 16 pgs in 'incomplete', 14 of them
> > with 0 objects, 2 with about 150 objects each.
> >
> > I have found a couple of howtos that tell me to use
> > ceph-objectstore-tool to find the pgs on the active osds and I've given
> > that a try, but ceph-objectstore-tool always tells me it can't find the
> > pg I am looking for.
> >
> > Can I tell ceph to re-init the pgs? Do I have to delete the pools and
> > recreate them?
> >
> > There's no data I can't get back in there, I just don't feel like
> > scrapping and redeploying the whole cluster.
> >
> > --
> > Cheers,
> > Hardy
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Cheers,
Hardy
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
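For the archives, the sequence described above can be sketched as a dry run. The OSD ids (4, 11, 23) and the pgid (1.23) are placeholders, not values from this cluster; take the real ones from `ceph pg ls incomplete`. The script only prints the commands it would run:

```shell
#!/bin/sh
# Dry-run sketch of the recovery steps discussed in this thread.
# OSD ids and the pgid below are hypothetical examples; substitute the
# ones that `ceph pg ls incomplete` reports. Drop the `echo` to actually
# run the commands, and wait for `ceph status` to settle between OSDs.

for osd in 4 11 23; do
    echo ceph osd lost "$osd" --yes-i-really-mean-it
done

# Nautilus requires --yes-i-really-mean-it for force-create-pg as well;
# note it reports "already created" if the PG already exists somewhere.
echo ceph osd force-create-pg 1.23 --yes-i-really-mean-it
```
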