Correcting inconsistent pg in EC pool

I came across an inconsistent pg in our 4+2 EC storage pool (ceph 10.2.5). Since "ceph pg repair" wasn't able to correct it, I followed the general outline given in this thread:

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-August/003965.html

# zgrep -Hn ERR /var/log/ceph/ceph-osd.368.log.*
/var/log/ceph/ceph-osd.368.log.1.gz:525:2017-03-19 23:41:11.736066 7f7a649d9700 -1 log_channel(cluster) log [ERR] : 70.319s0 shard 63(2): soid 70:98cb99a5:::default.539464.38__multipart_140411_SN261_0546_AC49YEACXX%2fsam%2fF10216D_CAAAAG_L004_001.sorted.dedup.realigned.recal.gvcf.2~LgDQTFVEBK6TSp2Kaw2Z3aylGsP_cRa.156:head candidate had a read error
/var/log/ceph/ceph-osd.368.log.1.gz:529:2017-03-19 23:47:47.160589 7f7a671de700 -1 log_channel(cluster) log [ERR] : 70.319s0 deep-scrub 0 missing, 1 inconsistent objects
/var/log/ceph/ceph-osd.368.log.1.gz:530:2017-03-19 23:47:47.160624 7f7a671de700 -1 log_channel(cluster) log [ERR] : 70.319 deep-scrub 1 errors

This shows where the error lies (shard 2, on osd.63), and the log on that osd has:

/var/log/ceph/ceph-osd.63.log.1.gz:811:2017-03-19 23:41:11.657532 7f8d67f77700  0 osd.63 pg_epoch: 474876 pg[70.319s2( v 474876'387130 (474876'384063,474876'387130] local-les=474678 n=38859 ec=21494 les/c/f 474678/474682/0 474662/474673/474565) [368,151,63,313,432,272] r=2 lpr=474673 pi=135288-474672/1939 luod=0'0 crt=474876'387128 active NIBBLEWISE] _scan_list  70:98cb99a5:::default.539464.38__multipart_140411_SN261_0546_AC49YEACXX%2fsam%2fF10216D_CAAAAG_L004_001.sorted.dedup.realigned.recal.gvcf.2~LgDQTFVEBK6TSp2Kaw2Z3aylGsP_cRa.156:head got -5 on read, read_error

and indeed the file has a read error.
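
For anyone retracing this, here is a minimal sketch of pulling the pg, the osd and the shard index out of a scrub "[ERR]" line like the one above, where "shard 63(2)" is read as osd.63 holding shard 2. The function name shard_from_scrub_err is hypothetical, and the pattern is only checked against the log format shown in this message.

```shell
#!/bin/sh
# Hypothetical helper (not part of ceph): read cluster-log [ERR] lines on
# stdin and print pg, osd id and shard index for each "shard N(M)" entry.
# Lines without a "shard N(M)" field (e.g. the deep-scrub summary) are skipped.
shard_from_scrub_err() {
    sed -n 's/.*\[ERR\] : \([0-9][0-9]*\.[0-9a-f]*s[0-9]*\) shard \([0-9][0-9]*\)(\([0-9][0-9]*\)).*/pg=\1 osd=\2 shard=\3/p'
}

# Possible usage on the primary's logs:
#   zgrep -h ERR /var/log/ceph/ceph-osd.368.log.* | shard_from_scrub_err
```

The osd id and shard index together give the pg shard name to look for on that osd, here 70.319s2 on osd.63.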

So I took the osd down and used ceph-objectstore-tool to export and then remove the affected pg (the export actually failed until I first deleted the bad file).
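
Roughly, the export-then-remove sequence looks like the sketch below. It is not a transcript of the exact commands: the systemd unit name, the default filestore data and journal paths, and the export file location are assumptions, and the manual deletion of the unreadable file is left out. With DRY_RUN=1 (the default here) it only prints the commands.

```shell
#!/bin/sh
# Sketch only. OSD id and pg shard are taken from the logs above; the
# paths, export file and systemd unit name are assumptions.
OSD_ID=${OSD_ID:-63}
PGID=${PGID:-70.319s2}
DATA_PATH=${DATA_PATH:-/var/lib/ceph/osd/ceph-$OSD_ID}
EXPORT_FILE=${EXPORT_FILE:-/root/pg-$PGID.export}
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

export_and_remove_pg() {
    # The osd must be stopped before ceph-objectstore-tool touches its store.
    run systemctl stop "ceph-osd@$OSD_ID"
    # Keep a copy of the pg shard before removing it.
    run ceph-objectstore-tool --data-path "$DATA_PATH" \
        --journal-path "$DATA_PATH/journal" \
        --pgid "$PGID" --op export --file "$EXPORT_FILE"
    # Remove the shard so that recovery can rebuild it from the other shards.
    run ceph-objectstore-tool --data-path "$DATA_PATH" \
        --journal-path "$DATA_PATH/journal" \
        --pgid "$PGID" --op remove
    run systemctl start "ceph-osd@$OSD_ID"
}
```

Calling export_and_remove_pg with the defaults prints the four commands for review; setting DRY_RUN=0 would actually run them, so the values should be checked first.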

After restarting the osd and waiting for recovery, the pg directory and its contents all appear to have been recreated, but the pg is still active+clean+inconsistent.

Am I missing something? "ceph pg repair" and "ceph pg scrub" also don't clear the inconsistency.

Thanks for any suggestions,

G.
--
Graham Allan
Minnesota Supercomputing Institute - gta@xxxxxxx