Hi Kenneth,
You should check for drive- or XFS-related errors in the /var/log/messages
files on all nodes. We've had a similar issue in the past with a bad
block on a hard drive.
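For example, something along these lines (the grep patterns are only a
guess at the usual kernel/XFS error messages, adjust as needed):

    grep -iE 'xfs|i/o error|blk_update_request|medium error' /var/log/messages*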
We had to (rough command sketch after the list below):
1. Stop the OSD associated with the drive that had the bad block, flush its
journal (ceph-osd -i $osd --flush-journal), and unmount the filesystem,
2. Clear the bad blocks in the RAID/PERC Controller,
3. Run xfs_repair on the partition, then partprobe the drive and start the OSD again,
4. ceph pg repair <pg_id>
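Roughly, the sequence looked like this (a sketch only; $osd, the device
paths and the mount point are placeholders, and it assumes systemd-managed
OSDs with the default mount location):

    systemctl stop ceph-osd@$osd            # stop the OSD daemon
    ceph-osd -i $osd --flush-journal        # flush its journal
    umount /var/lib/ceph/osd/ceph-$osd      # unmount the filesystem
    # clear the bad blocks in the RAID/PERC controller with the vendor tool
    xfs_repair /dev/$partition              # repair the XFS partition
    partprobe /dev/$disk                    # re-read the partition table
    systemctl start ceph-osd@$osd           # start the OSD again
    ceph pg repair <pg_id>                  # then repair the inconsistent PG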
Regards,
Frédéric.
On 04/10/2017 at 14:02, Kenneth Waegeman wrote:
Hi,
We have an inconsistency / scrub error on an erasure-coded pool that
I can't seem to solve.
[root@osd008 ~]# ceph health detail
HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
pg 5.144 is active+clean+inconsistent, acting
[81,119,148,115,142,100,25,63,48,11,43]
1 scrub errors
In the log files, it seems there is 1 missing shard:
/var/log/ceph/ceph-osd.81.log.2.gz:2017-10-02 23:49:11.940624
7f0a9d7e2700 -1 log_channel(cluster) log [ERR] : 5.144s0 shard 63(7)
missing 5:2297a2e1:::10014e2d8d5.00000000:head
/var/log/ceph/ceph-osd.81.log.2.gz:2017-10-03 00:48:06.681941
7f0a9d7e2700 -1 log_channel(cluster) log [ERR] : 5.144s0 deep-scrub 1
missing, 0 inconsistent objects
/var/log/ceph/ceph-osd.81.log.2.gz:2017-10-03 00:48:06.681947
7f0a9d7e2700 -1 log_channel(cluster) log [ERR] : 5.144 deep-scrub 1
errors
I tried running ceph pg repair on the PG, but nothing changed. I also
tried starting a new deep-scrub on OSD 81 (ceph osd deep-scrub 81),
but I don't see any deep-scrub starting on that OSD.
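Concretely, that was (with the pg and OSD ids taken from the output above):

    ceph pg repair 5.144
    ceph osd deep-scrub 81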
How can we solve this?
Thank you!
Kenneth
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com