On Thu, Oct 23, 2014 at 6:41 AM, Chris Kitzmiller <ckitzmiller@xxxxxxxxxxxxx> wrote:
> On Oct 22, 2014, at 8:22 PM, Craig Lewis wrote:
>
> Shot in the dark: try manually deep-scrubbing the PG. You could also try
> marking various OSDs OUT, in an attempt to get the acting set to include
> osd.25 again, then do the deep-scrub again. That probably won't help,
> though, because the pg query says it probed osd.25 already... actually, it
> doesn't. osd.25 is in "probing_osds", not "probed_osds". The deep-scrub
> might move things along.
>
> Re-reading your original post, if you marked the slow OSDs OUT but left
> them running, you should not have lost data.
>
> That's true. I just marked them out. I did lose osd.10 (in addition to
> out'ing those other two OSDs), so I'm not out of the woods yet.
>
> If the scrubs don't help, it's probably time to hop on IRC.
>
> When I issue the deep-scrub command, the cluster just doesn't scrub it.
> Same for a regular scrub. :(
>
> This pool was offering an RBD which I've lost my connection to, and it
> won't remount, so my data is totally inaccessible at the moment. Thanks
> for your help so far!

It looks like you are suffering from http://tracker.ceph.com/issues/9752,
which we've not yet seen in-house but have had reported a few times. I
suspect that Loic (CC'ed) would like to discuss your cluster's history with
you to try and narrow it down.
-Greg
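
For readers who land on this thread later, the operations discussed above
correspond to standard Ceph CLI commands. A minimal sketch, assuming a
hypothetical PG ID of 4.7f and the osd.25 / osd.10 numbering from the
thread:

    # Inspect the PG; check "probing_osds" vs. "probed_osds" in the recovery state
    ceph pg 4.7f query

    # Ask the cluster to scrub / deep-scrub the stuck PG
    ceph pg scrub 4.7f
    ceph pg deep-scrub 4.7f

    # Mark a slow OSD out while leaving its daemon running, so its data stays available
    ceph osd out 25

    # Bring it back in once the cluster has recovered
    ceph osd in 25

Whether the scrub actually starts can be watched with "ceph -w" or by
checking the PG's last scrub stamp in "ceph pg dump".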