You need to run "ceph pg deep-scrub 1.65" first.
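A rough sketch of that sequence, in case it helps: request the deep scrub, then poll until the inconsistency report is populated. The function name check_pg, the retry count, and the delay are made up for illustration. It also assumes rados exits non-zero while no scrub information is available, which the "error 2" output quoted below suggests but which I have not verified on Jewel.

```shell
#!/bin/sh
# Sketch only: deep-scrub a PG, then wait for the inconsistency report.
# check_pg, the retry count, and the delay are hypothetical choices.

check_pg() {
    pg="$1"
    tries="${2:-30}"   # assumed number of polling attempts
    delay="${3:-60}"   # assumed seconds between attempts

    # Ask the primary OSD to schedule a deep scrub of the PG.
    ceph pg deep-scrub "$pg" || return 1

    i=0
    while [ "$i" -lt "$tries" ]; do
        # Assumption: this fails until the deep scrub has completed.
        if rados list-inconsistent-obj "$pg" --format=json-pretty; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done

    echo "no scrub information for pg $pg after $tries attempts" >&2
    return 1
}
```

Calling "check_pg 1.65" would use the defaults above. A deep scrub can take a while on a large PG, so the defaults may need adjusting.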
On Mon, Nov 12, 2018 at 2:20 PM K.C. Wong <kcwong@xxxxxxxxxxx> wrote:
Hi Brad,
I got the following:
[root@mgmt01 ~]# ceph health detail
HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
pg 1.65 is active+clean+inconsistent, acting [62,67,47]
1 scrub errors
[root@mgmt01 ~]# rados list-inconsistent-obj 1.65
No scrub information available for pg 1.65
error 2: (2) No such file or directory
[root@mgmt01 ~]# rados list-inconsistent-snapset 1.65
No scrub information available for pg 1.65
error 2: (2) No such file or directory
Rather odd output, I’d say; not that I understand what
that means. I also tried rados list-inconsistent-pg:
[root@mgmt01 ~]# rados lspools
rbd
cephfs_data
cephfs_metadata
.rgw.root
default.rgw.control
default.rgw.data.root
default.rgw.gc
default.rgw.log
ctrl-p
prod
corp
camp
dev
default.rgw.users.uid
default.rgw.users.keys
default.rgw.buckets.index
default.rgw.buckets.data
default.rgw.buckets.non-ec
[root@mgmt01 ~]# for i in $(rados lspools); do rados list-inconsistent-pg $i; done
[]
["1.65"]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
So, that’d put the inconsistency in the cephfs_data pool.
Thank you for your help,
-kc
K.C. Wong
M: +1 (408) 769-8235
-----------------------------------------------------
Confidentiality Notice:
This message contains confidential information. If you are not the
intended recipient and received this message in error, any use or
distribution is strictly prohibited. Please also notify us
immediately by return e-mail, and delete this message from your
computer system. Thank you.
-----------------------------------------------------
4096R/B8995EDE E527 CBE8 023E 79EA 8BBB 5C77 23A6 92E9 B899 5EDE
On Nov 11, 2018, at 5:43 PM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
What does "rados list-inconsistent-obj <pg>" say?
Note that you may have to do a deep scrub to populate the output.
On Mon, Nov 12, 2018 at 5:10 AM K.C. Wong <kcwong@xxxxxxxxxxx> wrote:
Hi folks,
I would appreciate any pointer as to how I can resolve a
PG stuck in “active+clean+inconsistent” state. This has
resulted in HEALTH_ERR status for the last 5 days with no
end in sight. The state got triggered when one of the drives
in the PG returned an I/O error. I’ve since replaced the failed
drive.
I’m running Jewel (out of centos-release-ceph-jewel) on
CentOS 7. I’ve tried “ceph pg repair <pg>” and it didn’t seem
to do anything. I’ve tried even more drastic measures such as
comparing all the files (using filestore) under that PG_head
on all 3 copies and then nuking the outlier. Nothing worked.
Many thanks,
-kc
K.C. Wong
kcwong@xxxxxxxxxxx
M: +1 (408) 769-8235
4096R/B8995EDE E527 CBE8 023E 79EA 8BBB 5C77 23A6 92E9 B899 5EDE
hkps://hkps.pool.sks-keyservers.net
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Cheers,
Brad