Sure thing, n.b. I increased pg count to see if it would help. Alas not. :)

Thanks again!

health_detail
https://gist.github.com/199bab6d3a9fe30fbcae
osd_dump
https://gist.github.com/499178c542fa08cc33bb
osd_tree
https://gist.github.com/02b62b2501cbd684f9b2

Random selected queries:
queries/0.19.query
https://gist.github.com/f45fea7c85d6e665edf8
queries/1.a1.query
https://gist.github.com/dd68fbd5e862f94eb3be
queries/7.100.query
https://gist.github.com/d4fd1fb030c6f2b5e678
queries/7.467.query
https://gist.github.com/05dbcdc9ee089bd52d0c

On Tue, Mar 10, 2015 at 2:49 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
> Yeah, get a ceph pg query on one of the stuck ones.
> -Sam
>
> On Tue, 2015-03-10 at 14:41 +0000, joel.merrick@xxxxxxxxx wrote:
>> Stuck unclean and stuck inactive. I can fire up a full query and
>> health dump somewhere useful if you want (full pg query info on ones
>> listed in health detail, tree, osd dump etc). There were blocked_by
>> operations that no longer exist after doing the OSD addition.
>>
>> Side note, spent some time yesterday writing some bash to do this
>> programmatically (might be useful to others, will throw on github)
>>
>> On Tue, Mar 10, 2015 at 1:41 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>> > What do you mean by "unblocked" but still "stuck"?
>> > -Sam
>> >
>> > On Mon, 2015-03-09 at 22:54 +0000, joel.merrick@xxxxxxxxx wrote:
>> >> On Mon, Mar 9, 2015 at 2:28 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>> >> > You'll probably have to recreate osds with the same ids (empty ones),
>> >> > let them boot, stop them, and mark them lost. There is a feature in the
>> >> > tracker to improve this behavior: http://tracker.ceph.com/issues/10976
>> >> > -Sam
>> >>
>> >> Thanks Sam, I've re-added the OSDs, they became unblocked but there are
>> >> still the same number of pgs stuck. I looked at them in some more
>> >> detail and it seems they all have num_bytes='0'. Tried a repair too,
>> >> for good measure. Still nothing I'm afraid.
>> >>
>> >> Does this mean some underlying catastrophe has happened and they are
>> >> never going to recover? Following on, would that cause data loss?
>> >> There are no missing objects and I'm hoping there's appropriate
>> >> checksumming / replicas to balance that out, but now I'm not so sure.
>> >>
>> >> Thanks again,
>> >> Joel
>> >
>> >
>>
>>
>
>

-- 
$ echo "kpfmAdpoofdufevq/dp/vl" | perl -pe 's/(.)/chr(ord($1)-1)/ge'
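
For anyone wanting to collect the same per-PG query dumps, here is a rough sketch of the idea in Python. It is not the bash script mentioned in the thread. It assumes that `ceph health detail` prints one line per stuck PG of the form `pg <pgid> is stuck <state> ...`, and it writes each `ceph pg <pgid> query` result to `queries/<pgid>.query`, mirroring the file names linked in this message. The helper names (`stuck_pgs`, `run_ceph`, `dump_queries`) are made up for illustration.

```python
import re
import subprocess
from pathlib import Path

# Assumed shape of a stuck-PG line in `ceph health detail`, e.g.
#   pg 7.467 is stuck inactive since forever, current state ...
STUCK_RE = re.compile(r"^pg (\S+) is stuck (\w+)")


def stuck_pgs(health_detail_text):
    """Return the sorted, de-duplicated PG ids reported as stuck."""
    pgs = set()
    for line in health_detail_text.splitlines():
        match = STUCK_RE.match(line.strip())
        if match:
            pgs.add(match.group(1))
    return sorted(pgs)


def run_ceph(args):
    """Run the ceph CLI with the given arguments and return its stdout."""
    result = subprocess.run(
        ["ceph", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def dump_queries(outdir="queries", runner=run_ceph):
    """Write `ceph pg <pgid> query` output to <outdir>/<pgid>.query.

    `runner` is injectable so the logic can be exercised without a cluster.
    """
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for pgid in stuck_pgs(runner(["health", "detail"])):
        path = out / f"{pgid}.query"
        path.write_text(runner(["pg", pgid, "query"]))
        written.append(path)
    return written
```

Calling `dump_queries()` on a node with a working `ceph` client and admin keyring should leave one file per stuck PG under `queries/`. A PG that is listed as both stuck inactive and stuck unclean is only queried once.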