Hi cephers,
Any ideas on how to proceed with the inconsistencies below? At the
moment our Ceph setup has 5 of these. In all cases the objects are
zero-length on disk and identical across the three replicas, but
their size does not match the object info size. I tried running
pg repair on one of them, but it didn't fix the problem:
2016-12-20 16:24:40.870307 7f3e1a4b1700  0 log_channel(cluster) log [INF] : 6.92c repair starts
2016-12-20 16:27:06.183186 7f3e1a4b1700 -1 log_channel(cluster) log [ERR] : repair 6.92c 6:34932257:::1000187bbb5.00000009:head on disk size (0) does not match object info size (3014656) adjusted for ondisk to (3014656)
2016-12-20 16:27:35.885496 7f3e17cac700 -1 log_channel(cluster) log [ERR] : 6.92c repair 1 errors, 0 fixed
Any help/hints would be appreciated.
Thanks,
Andras
On 12/15/2016 10:13 AM, Andras Pataki wrote:
Hi everyone,
Yesterday scrubbing turned up an inconsistency in one of our
placement groups. We are running ceph 10.2.3, using CephFS and
RBD for some VM images.
[root@hyperv017 ~]# ceph -s
    cluster d7b33135-0940-4e48-8aa6-1d2026597c2f
     health HEALTH_ERR
            1 pgs inconsistent
            1 scrub errors
            noout flag(s) set
     monmap e15: 3 mons at {hyperv029=10.4.36.179:6789/0,hyperv030=10.4.36.180:6789/0,hyperv031=10.4.36.181:6789/0}
            election epoch 27192, quorum 0,1,2 hyperv029,hyperv030,hyperv031
      fsmap e17181: 1/1/1 up {0=hyperv029=up:active}, 2 up:standby
     osdmap e342930: 385 osds: 385 up, 385 in
            flags noout
      pgmap v37580512: 34816 pgs, 5 pools, 673 TB data, 198 Mobjects
            1583 TB used, 840 TB / 2423 TB avail
               34809 active+clean
                   4 active+clean+scrubbing+deep
                   2 active+clean+scrubbing
                   1 active+clean+inconsistent
  client io 87543 kB/s rd, 671 MB/s wr, 23 op/s rd, 2846 op/s wr
# ceph pg dump | grep inconsistent
6.13f1 4692 0 0 0 0 16057314767 3087 3087 active+clean+inconsistent 2016-12-14 16:49:48.391572 342929'41011 342929:43966 [158,215,364] 158 [158,215,364] 158 342928'40540 2016-12-14 16:49:48.391511 342928'40540 2016-12-14 16:49:48.391511
I tried a couple of other deep scrubs on pg 6.13f1 but got
repeated errors. In the OSD logs:
2016-12-14 16:48:07.733291 7f3b56e3a700 -1 log_channel(cluster) log [ERR] : deep-scrub 6.13f1 6:8fc91b77:::1000187bb70.00000009:head on disk size (0) does not match object info size (1835008) adjusted for ondisk to (1835008)
I looked at the objects on the three OSDs on their respective hosts
and they are identical, zero-length files:
# cd ~ceph/osd/ceph-158/current/6.13f1_head
# find . -name *1000187bb70* -ls
669738 0 -rw-r--r-- 1 ceph ceph 0 Dec 13 17:00 ./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6
# cd ~ceph/osd/ceph-215/current/6.13f1_head
# find . -name *1000187bb70* -ls
539815647 0 -rw-r--r-- 1 ceph ceph 0 Dec 13 17:00 ./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6
# cd ~ceph/osd/ceph-364/current/6.13f1_head
# find . -name *1000187bb70* -ls
1881432215 0 -rw-r--r-- 1 ceph ceph 0 Dec 13 17:00 ./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6
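As a cross-check that these files really are the object named in the scrub error, the hash values can be compared with a small Python sketch. It rests on what the numbers in this thread suggest rather than on a documented format: the hex field in the log entry (8fc91b77) looks like the bit-reversed form of the hash in the on-disk file name (EED893F1), and the DIR_ path looks like the hash's hex digits in reverse order. The 13-bit PG mask (i.e. a pg_num of 8192 for pool 6) is purely an assumption that happens to fit both PGs mentioned here, and the function names are made up for illustration.

```python
def reverse_bits32(value):
    """Reverse the bit order of a 32-bit integer."""
    result = 0
    for _ in range(32):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def dir_path(obj_hash, depth):
    """Build a DIR_x/DIR_y/... path from the hash's hex digits, last digit first."""
    digits = format(obj_hash, "08X")[::-1]
    return "/".join("DIR_" + d for d in digits[:depth])


# Hash taken from the on-disk file name ...__head_EED893F1__6
file_hash = 0xEED893F1

# Hex field from the scrub log entry 6:8fc91b77:::1000187bb70.00000009:head
log_field = 0x8FC91B77

print(hex(reverse_bits32(file_hash)))   # 0x8fc91b77, same as the log field
print(dir_path(file_hash, 5))           # DIR_1/DIR_F/DIR_3/DIR_9/DIR_8

# ASSUMPTION: pool 6 has a power-of-two pg_num of 8192, so the PG id is
# simply the low 13 bits of the hash. This is not stated anywhere in the thread.
PG_MASK = 0x1FFF
print(hex(file_hash & PG_MASK))         # 0x13f1, i.e. pg 6.13f1
```

The same arithmetic applied to the other log entry (6:34932257:::1000187bbb5.00000009:head) gives a hash of EA44C92C, whose low 13 bits are 0x92c, consistent with pg 6.92c under the same assumption.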
At the time of the write, there wasn't anything unusual going on
as far as I can tell (no hardware/network issues, all processes
were up, etc).
This pool is a CephFS data pool, and the corresponding file (inode
hex 1000187bb70, decimal 1099537300336) looks like this:
# ls -li chr4.tags.tsv
1099537300336 -rw-r--r-- 1 xichen xichen 14469915 Dec 13 17:01 chr4.tags.tsv
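The mapping from object name to inode number can be reproduced with a short Python sketch. It assumes the data-pool object name has the form <inode in hex>.<object index in hex>; the helper name parse_object_name is made up for illustration.

```python
from typing import Tuple


def parse_object_name(name: str) -> Tuple[int, int]:
    """Split '<inode hex>.<index hex>' into (inode number, object index)."""
    ino_hex, index_hex = name.split(".")
    return int(ino_hex, 16), int(index_hex, 16)


ino, index = parse_object_name("1000187bb70.00000009")
print(ino)    # 1099537300336, the inode number shown by ls -li
print(index)  # 9
```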
Reading the file is also ok (no errors, right number of bytes):
# cat chr4.tags.tsv > /dev/null
# wc chr4.tags.tsv
592251 2961255 14469915 chr4.tags.tsv
We are using the standard 4MB object size for CephFS, and if I
interpret this right, the .00000009 suffix is object index 9
(counting from zero, so the chunk starting at offset 36MB). There
shouldn't be any data there (or even such an object), since the
file is only 14MB.
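The chunk arithmetic can be sketched in Python as follows. This assumes the default file layout (4 MiB objects, no striping across objects) and that the numeric suffix of the object name is a zero-based index; the function names are only for illustration.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumption: default 4 MiB CephFS object size


def object_count(file_size, object_size=OBJECT_SIZE):
    """Number of objects needed to hold file_size bytes (ceiling division)."""
    return (file_size + object_size - 1) // object_size


def object_byte_range(index, object_size=OBJECT_SIZE):
    """Half-open byte range [start, end) of the file covered by a given object index."""
    start = index * object_size
    return start, start + object_size


file_size = 14469915  # size of chr4.tags.tsv from ls -li

print(object_count(file_size))   # 4, so only object indexes 0..3 are expected
print(object_byte_range(9))      # (37748736, 41943040), well past the end of the file
```

So under these assumptions an object with index 9 would only exist if the file had at some point extended beyond 36 MiB.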
Should I run pg repair on this? Any ideas on how this could come
about? Any other recommendations?
Thanks,
Andras
apataki@xxxxxxxxxxx