Inactive pgs preventing osd from starting

Hi All,

Long story short, we’re doing disaster recovery on a CephFS cluster, and are at a point where we have 8 PGs stuck incomplete.  Just before the disaster, I increased pg_num on two of the pools, and pgp_num had not yet finished catching up.  I’ve since forced pgp_num to the current values.

So far, I’ve tried mark_unfound_lost, but the PGs don’t report any unfound objects, and I’ve tried force-create-pg, which had no effect except on one of the PGs, which went to creating+incomplete.  During the disaster recovery, I had to re-create several OSDs (due to unreadable superblocks), and now one of the new OSDs, as well as one of the existing OSDs, won’t start.  The log from the startup of osd.29 is here: https://pastebin.com/PX9AAj8m. It seems to indicate that the OSD won’t start because it’s supposed to have copies of the incomplete placement groups.

ceph pg 5.38 query (one of the incomplete) gives: https://pastebin.com/Jf4GnZTc

I have searched the OSDs listed for each of the incomplete placement groups for any copy of a PG that I could mark as complete with ceph-objectstore-tool, but can’t find any.  I don’t care about the data in the PGs, but I can’t abandon the filesystem.
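
For anyone wanting to repeat that kind of search, here is a minimal sketch of it, not a record of the exact commands run on this cluster. It assumes the OSD data directories live under /var/lib/ceph/osd/ceph-<id> (adjust for containerized deployments), that each OSD is stopped while ceph-objectstore-tool runs against it, and that the list-pgs output is one PG id per line. The helper names (list_pgs_cmd, mark_complete_cmd, find_pg_copies) are made up for illustration. The commands are only built here, not executed.

```python
def list_pgs_cmd(osd_id, data_root="/var/lib/ceph/osd"):
    """Build the command that lists the PGs stored on a (stopped) OSD."""
    return ["ceph-objectstore-tool",
            "--data-path", f"{data_root}/ceph-{osd_id}",
            "--op", "list-pgs"]


def mark_complete_cmd(osd_id, pgid, data_root="/var/lib/ceph/osd"):
    """Build the command that marks one PG copy on a (stopped) OSD as complete."""
    return ["ceph-objectstore-tool",
            "--data-path", f"{data_root}/ceph-{osd_id}",
            "--pgid", pgid,
            "--op", "mark-complete"]


def find_pg_copies(pgid, listings):
    """Return the sorted OSD ids whose list-pgs output mentions pgid.

    listings maps an OSD id to the captured text output of list-pgs.
    Assumption: one PG id per line; erasure-coded shards appear with a
    shard suffix such as "5.38s0", which is matched as well.
    """
    holders = []
    for osd_id, text in listings.items():
        entries = text.split()
        if any(e == pgid or e.startswith(pgid + "s") for e in entries):
            holders.append(osd_id)
    return sorted(holders)


if __name__ == "__main__":
    # Hypothetical captured output, purely for illustration.
    sample = {29: "5.1a\n4.7\n", 3: "5.38\n4.2\n"}
    for osd in find_pg_copies("5.38", sample):
        print(" ".join(mark_complete_cmd(osd, "5.38")))
```

If find_pg_copies returns an empty list for every incomplete PG, as it effectively did here, there is no surviving copy to mark complete.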

Any help would be greatly appreciated.

-TJ Ragan