> From: Eugen Block [mailto:eblock@xxxxxx]
>
> From what I understand, with a rep size of 2 the cluster can't decide
> which object is intact if one is broken, so the repair fails. If you
> had a size of 3, the cluster would see 2 intact objects and repair the
> broken one (I guess). At least we didn't have these inconsistencies
> since we increased the size to 3.

I understand. Anyway, we have a healthy cluster again :)

After a few ERR entries in the logs...

2017-01-26 06:08:48.147132 osd.3 192.168.12.150:6802/5421 129 : cluster [ERR] 10.55 shard 3: soid 10:aa0c6d9c:::ef4069bf-70fb-4414-a9d9-6bf5b32608fb.34127.33_nalazi%2f201607%2fLab_7bd28004-cc9d-4039-9567-7f5c597f6d88.pdf:head data_digest 0xc44df2ba != known data_digest 0xff59029 from auth shard 4
2017-01-26 06:19:55.708510 osd.3 192.168.12.150:6802/5421 130 : cluster [ERR] 10.55 deep-scrub 0 missing, 1 inconsistent objects
2017-01-26 06:19:55.708514 osd.3 192.168.12.150:6802/5421 131 : cluster [ERR] 10.55 deep-scrub 1 errors
2017-01-26 10:00:48.267405 osd.3 192.168.12.150:6806/18501 2 : cluster [ERR] 10.55 shard 3 missing 10:aa0c6d9c:::ef4069bf-70fb-4414-a9d9-6bf5b32608fb.34127.33_nalazi%2f201607%2fLab_7bd28004-cc9d-4039-9567-7f5c597f6d88.pdf:head
2017-01-26 10:06:56.062854 osd.3 192.168.12.150:6806/18501 3 : cluster [ERR] 10.55 scrub 1 missing, 0 inconsistent objects
2017-01-26 10:06:56.062859 osd.3 192.168.12.150:6806/18501 4 : cluster [ERR] 10.55 scrub 1 errors ( 1 remaining deep scrub error(s) )
2017-01-26 12:54:45.748066 osd.3 192.168.12.150:6806/18501 18 : cluster [ERR] 10.55 shard 3: soid 10:aa0c6d9c:::ef4069bf-70fb-4414-a9d9-6bf5b32608fb.34127.33_nalazi%2f201607%2fLab_7bd28004-cc9d-4039-9567-7f5c597f6d88.pdf:head size 0 != known size 52102, missing attr _, missing attr _user.rgw.acl, missing attr _user.rgw.content_type, missing attr _user.rgw.etag, missing attr _user.rgw.idtag, missing attr _user.rgw.manifest, missing attr _user.rgw.pg_ver, missing attr _user.rgw.source_zone, missing attr _user.rgw.x-amz-acl, missing attr _user.rgw.x-amz-date, missing attr snapset
2017-01-26 13:02:18.014584 osd.3 192.168.12.150:6806/18501 19 : cluster [ERR] 10.55 scrub 0 missing, 1 inconsistent objects
2017-01-26 13:02:18.014607 osd.3 192.168.12.150:6806/18501 20 : cluster [ERR] 10.55 scrub 1 errors ( 1 remaining deep scrub error(s) )
2017-01-26 13:16:56.634322 osd.3 192.168.12.150:6806/18501 22 : cluster [ERR] 10.55 shard 3: soid 10:aa0c6d9c:::ef4069bf-70fb-4414-a9d9-6bf5b32608fb.34127.33_nalazi%2f201607%2fLab_7bd28004-cc9d-4039-9567-7f5c597f6d88.pdf:head data_digest 0xffffffff != known data_digest 0xff59029 from auth shard 4, size 0 != known size 52102, missing attr _, missing attr _user.rgw.acl, missing attr _user.rgw.content_type, missing attr _user.rgw.etag, missing attr _user.rgw.idtag, missing attr _user.rgw.manifest, missing attr _user.rgw.pg_ver, missing attr _user.rgw.source_zone, missing attr _user.rgw.x-amz-acl, missing attr _user.rgw.x-amz-date, missing attr snapset

We got this:

2017-01-26 13:31:04.577603 osd.3 192.168.12.150:6806/18501 23 : cluster [ERR] 10.55 repair 0 missing, 1 inconsistent objects
2017-01-26 13:31:04.596102 osd.3 192.168.12.150:6806/18501 24 : cluster [ERR] 10.55 repair 1 errors, 1 fixed

And...
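(Before the status output, a note for anyone who finds this thread with the same scrub errors: this is a minimal sketch of the usual inconsistent-PG workflow on Jewel, using our PG id 10.55 as the example, and not necessarily the exact method discussed earlier in the thread.)

Show which PGs are flagged inconsistent and why:
# ceph health detail
List the damaged object(s) in the PG and which shard disagrees:
# rados list-inconsistent-obj 10.55 --format=json-pretty
Ask the primary OSD to repair from the authoritative copy:
# ceph pg repair 10.55
Re-run a deep scrub afterwards to confirm the PG is clean:
# ceph pg deep-scrub 10.55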
# ceph -s
    cluster 2bf80721-fceb-4b63-89ee-1a5faa278493
     health HEALTH_OK
     monmap e1: 1 mons at {cephadm01=192.168.12.150:6789/0}
            election epoch 7, quorum 0 cephadm01
     osdmap e580: 9 osds: 9 up, 9 in
            flags sortbitwise
      pgmap v11436879: 664 pgs, 13 pools, 1011 GB data, 13900 kobjects
            2143 GB used, 2354 GB / 4497 GB avail
                 661 active+clean
                   3 active+clean+scrubbing

Your method worked! Thank you for your time and help!

I will see if we can add some more disks so we can set the replica size to 3.
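(In case it helps anyone planning the same change: once the extra disks are in, raising a pool to three replicas is one setting per pool. The pool name below is just a placeholder, and min_size 2 is the usual companion value so I/O keeps flowing while the third copy backfills.)

Set three copies on the pool:
# ceph osd pool set <pool-name> size 3
Allow I/O with two intact copies:
# ceph osd pool set <pool-name> min_size 2
Watch the backfill progress:
# ceph -w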