Hammer: PGs stuck creating

Greetings,

I have a lab cluster running Hammer 0.94.6 that is used exclusively for object storage.  The cluster consists of four servers, each running 60 6TB OSDs.  The main .rgw.buckets pool uses k=3, m=1 erasure coding and contains 8192 placement groups.
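
For context, the pool was created along roughly these lines (the profile name and failure domain below are placeholders, not necessarily the exact values we used):

    ceph osd erasure-code-profile set rgw-ec-profile k=3 m=1 ruleset-failure-domain=host
    ceph osd pool create .rgw.buckets 8192 8192 erasure rgw-ec-profile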

Last week, one of our guys marked out and removed one OSD from each of three of the four servers in the cluster, which resulted in some general badness (the disks were wiped post-removal, so the data are gone).  After a proper education in why this is a Bad Thing, we got the OSDs added back.  When all was said and done, we had 30 pgs stuck incomplete, and no amount of magic has been able to get them to recover.  From reviewing the data, we knew that all of these pgs contained at least 2 of the removed OSDs; I understand and accept that the data are gone, and that's not a concern (yay lab).
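
That review amounted to roughly the following (the pgid here is just an example):

    # list the 30 stuck pgs
    ceph health detail | grep incomplete

    # for each one, check which OSDs it maps to (and used to map to)
    ceph pg map 13.7f5
    ceph pg 13.7f5 query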

Here are the things I've tried (the full command sequence is sketched after the list):

- Restarted all OSDs
- Stopped all OSDs, removed all OSDs from the crush map, and started everything back up
- Executed a 'ceph pg force_create_pg <id>' for each of the 30 stuck pgs
- Executed a 'ceph pg send_pg_creates' to get the ball rolling on creates
- Executed several 'ceph pg <id> query' commands to ensure we were referencing valid OSDs after the 'force_create_pg'
- Ensured those OSDs were really removed (e.g. 'ceph auth del', 'ceph osd crush remove', and 'ceph osd rm')
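
For clarity, the whole sequence looked roughly like this (the OSD id and pgid below are placeholders):

    # finish removing each of the three wiped OSDs
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12

    # ask the monitors to recreate each stuck pg
    ceph pg force_create_pg 13.7f5
    ceph pg send_pg_creates

    # confirm the pg is now mapped to valid OSDs
    ceph pg map 13.7f5
    ceph pg 13.7f5 query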

At this point, I've got the same 30 pgs stuck creating.  I've run out of ideas for getting this back to a healthy state.  In reviewing other posts on the mailing list, the overwhelming consensus was that the culprit is a bad OSD in the crush map, but I'm all but certain that isn't what's hitting us here.  Normally, this being the lab, I'd consider nuking the .rgw.buckets pool and starting from scratch, but we've recently spent a lot of time pulling 140TB of data into this cluster for some performance and recovery tests, and I'd prefer not to start that process over.  I am willing to entertain most any other idea, irrespective of how destructive it is to these PGs, so long as I don't lose the rest of the data in the pool.

Many thanks in advance for any assistance here.

Brian Felton



