Re: How to force lost PGs

Gaylord Holder <gholder@xxxxxxxxxxxxx> · Tue, 03 Sep 2013 11:05:11 -0400

Awesome Sage!

I knew I had lost data.  I'm trying to find out what will happen when 
the worst happens (like the Ceph administrator being an idiot).

So those PGs are hanging around in an OSD/pool somewhere with some kind 
of reference count and they just need to be recreated?

Thanks again for unsticking me.

-Gaylord

On 09/03/2013 10:44 AM, Sage Weil wrote:
On Sun, 1 Sep 2013, Gaylord Holder wrote:

I created a pool with no replication and an RBD within that pool.  I mapped
the RBD to a machine, formatted it with a file system and dumped data on it.

Just to see what kind of trouble I can get into, I stopped the OSD the RBD was
using, marked the OSD as out, and reformatted the OSD tree.

When I brought the OSD back up, I now have three stale PGs.

Now I'm trying to clear the stale PGs.  I've tried removing the OSD from the
crush maps, the OSD lists etc, without any luck.

Note that this means that you destroyed all copies of those 3 PGs, which
means this experiment lost data.

You can make ceph recreate the PGs (empty!) with

  ceph pg force_create_pg <pgid>

sage
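
With several stale PGs, the command above can be looped over each PG id. What follows is only a rough sketch, not something from the thread: extract_pgids and recreate_stale_pgs are made-up helper names, the APPLY variable is a made-up dry-run switch, and the sketch assumes that `ceph pg dump_stuck stale` prints one PG per line with the PG id in the first column. The recreated PGs are empty, so whatever was stored in them stays lost.

```shell
#!/bin/sh
# Sketch: recreate every stale PG as an EMPTY PG (the data in them is gone).
# extract_pgids, recreate_stale_pgs and APPLY are hypothetical names.

# Print the first column of every line that starts with a PG id
# (<pool>.<hex>), skipping header and status lines.
extract_pgids() {
    awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ { print $1 }'
}

# Read PG ids on stdin. By default only print the commands that would be
# run; set APPLY=1 to actually run them against the cluster.
recreate_stale_pgs() {
    while read -r pgid; do
        if [ "${APPLY:-0}" = "1" ]; then
            ceph pg force_create_pg "$pgid"
        else
            echo "ceph pg force_create_pg $pgid"
        fi
    done
}

# Usage (assumes the PG id is the first column of the dump_stuck output):
#   ceph pg dump_stuck stale | extract_pgids | recreate_stale_pgs
```

Running it without APPLY=1 first shows which PGs would be recreated, which seems prudent given that the operation cannot bring the data back.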

Running
   ceph pg 3.1 query
   ceph pg 3.1 mark_unfound_lost revert
ceph explains it doesn't have a PG 3.1

Running
  ceph osd repair osd.1
hangs after pg 2.3e

Running
   ceph osd lost 1 --yes-i-really-mean-it
nukes the osd.  Rebuilding osd.1 goes fine, but I still have 3 stale PGs.

Any help clearing these stale PGs would be appreciated.

Thanks,
-Gaylord
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
