Hi Ceph community,

Env: hammer 0.94.2, Scientific Linux 6.6, kernel 2.6.32-431.5.1.el6.x86_64

We wanted to post here before the tracker to see if someone else has had this problem.

We have a few PGs (in different pools) which get marked inconsistent when we stop the primary OSD. The problem is strange because once we restart the primary and then scrub the PG, the PG is marked active+clean. But inevitably, the next time we stop the primary OSD, the same PG is marked inconsistent again.

There is no user activity on this PG, and nothing interesting is logged in any of the 2nd/3rd OSDs (with debug_osd=20, the first line mentioning the PG already says inactive+inconsistent).

We suspect this is related to garbage files left in the PG folder.

One of our PGs behaves basically as above, except that it goes through this cycle:

  active+clean -> (deep-scrub) -> active+clean+inconsistent -> (repair) -> active+clean -> (restart primary OSD) -> (deep-scrub) -> active+clean+inconsistent

This one at least logs:

  2015-07-22 16:42:41.821326 osd.303 [INF] 55.10d deep-scrub starts
  2015-07-22 16:42:41.823834 osd.303 [ERR] 55.10d deep-scrub stat mismatch, got 0/1 objects, 0/0 clones, 0/1 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts, 0/0 bytes,0/0 hit_set_archive bytes.
  2015-07-22 16:42:41.823842 osd.303 [ERR] 55.10d deep-scrub 1 errors

and this should be debuggable, because the pool stats show only one object in the pool:

  tapetest 55 0 0 73575G 1

even though rados ls returns no objects:

  # rados ls -p tapetest
  #

Any ideas?

Cheers, Dan
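
P.S. In case it helps anyone compare their own logs against ours, here is a rough sketch of a script that picks apart the "stat mismatch" line and reports only the counters whose two numbers differ. The function names are made up for this example, and the parsing is based solely on the log line quoted above, so other Ceph versions may format it differently. Which side of each "a/b" pair is the scrub result and which is the PG's recorded stat is an assumption on our part; the script just reports both numbers.

```python
import re

# Matches pairs such as "0/1 objects" or "0/0 hit_set_archive bytes".
# The pattern is derived only from the single log line in this thread.
PAIR_RE = re.compile(r"(\d+)/(\d+) ([a-z_]+(?: bytes)?)")


def parse_stat_mismatch(line):
    """Return {counter_name: (first, second)} for a 'stat mismatch' log line.

    Returns an empty dict if the line is not a stat mismatch message.
    """
    # Drop the timestamp, OSD id and PG id so their digits are not matched.
    _, sep, tail = line.partition("stat mismatch, got ")
    if not sep:
        return {}
    return {name: (int(a), int(b)) for a, b, name in PAIR_RE.findall(tail)}


def mismatched(line):
    """Return only the counters whose two numbers disagree."""
    return {k: v for k, v in parse_stat_mismatch(line).items() if v[0] != v[1]}


if __name__ == "__main__":
    log_line = (
        "2015-07-22 16:42:41.823834 osd.303 [ERR] 55.10d deep-scrub stat "
        "mismatch, got 0/1 objects, 0/0 clones, 0/1 dirty, 0/0 omap, "
        "0/0 hit_set_archive, 0/0 whiteouts, 0/0 bytes,0/0 hit_set_archive bytes."
    )
    print(mismatched(log_line))
```

For the line from osd.303 this reports only "objects" and "dirty" as differing (0 vs 1 in both cases), which matches the pool stats claiming one object while rados ls shows none.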