On Fri, Jan 1, 2016 at 12:15 PM, Bryan Wright <bkw1a@xxxxxxxxxxxx> wrote:
> Hi folks,
>
> "ceph pg dump_stuck inactive" shows:
>
> 0.e8 incomplete [406,504] 406 [406,504] 406
>
> Each of the osds above is alive and well, and idle.
>
> The output of "ceph pg 0.e8 query" is shown below. All of the osds it refers
> to are alive and well, with the exception of osd 102 which died and has been
> removed from the cluster.
>
> Can anyone look at this and tell me why this pg is incomplete?
>
> Bryan
>
> "ceph pg query" output is here, because it's so large:
>
> http://ayesha.phys.virginia.edu/~bryan/errant-pg.txt

I can't parse all of that output, but the most important and
easiest-to-understand bit is:

"blocked_by": [
    102
],

And indeed in the past_intervals section there are a bunch where it's
just 102. You really want min_size >= 2 for exactly this reason. :/ But
if you get 102 up stuff should recover; if you can't you can mark it
as "lost" and RADOS ought to resume processing, with potential
data/metadata loss...
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
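
Since the query output is too large to read by eye, here is a rough Python sketch of one way to pull out every "blocked_by" entry from it. It assumes the output of "ceph pg 0.e8 query" has been saved as JSON. The helper name find_blocking_osds and the trimmed sample document are made up for illustration, and the sample is not the real output from the cluster above. The search is recursive precisely so that it does not depend on where in the JSON the "blocked_by" key sits.

```python
import json


def find_blocking_osds(node):
    """Recursively collect every OSD id listed under any "blocked_by" key."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "blocked_by" and isinstance(value, list):
                found.update(value)
            else:
                found |= find_blocking_osds(value)
    elif isinstance(node, list):
        for item in node:
            found |= find_blocking_osds(item)
    return found


# Hypothetical, heavily trimmed stand-in for saved "ceph pg 0.e8 query" output.
sample = json.loads("""
{
  "state": "incomplete",
  "recovery_state": [
    {"name": "Started/Primary/Peering", "blocked_by": [102]}
  ]
}
""")

print(sorted(find_blocking_osds(sample)))  # prints [102]
```

With real output, the sample string would be replaced by the contents of the saved file, for example via json.load on an open file handle.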