Inconsistent rbd header

This is probably related to the problem reported by Oliver Francke.

Ceph is reporting an inconsistent PG. Running a scrub on the PG gave
me the following messages:

2012-03-16 12:55:17.287415   log 2012-03-16 12:55:12.179529 osd.14
10.255.0.63:6818/2014 34280 : [ERR] 2.117 osd.0: soid
818bc117/vol-FTU86AEJ.rbd/headsize 0 != known size 112
2012-03-16 12:55:17.287415   log 2012-03-16 12:55:12.179555 osd.14
10.255.0.63:6818/2014 34281 : [ERR] 2.117 scrub 0 missing, 1
inconsistent objects
2012-03-16 12:55:17.287415   log 2012-03-16 12:55:12.181397 osd.14
10.255.0.63:6818/2014 34282 : [ERR] scrub 2.117
818bc117/vol-FTU86AEJ.rbd/head on disk size (112) does not match
object info size (0)
2012-03-16 12:55:17.287415   log 2012-03-16 12:55:12.181925 osd.14
10.255.0.63:6818/2014 34283 : [ERR] 2.117 scrub stat mismatch, got
956/955 objects, 0/0 clones, 3951263856/3951263744 bytes.
2012-03-16 12:55:17.287415   log 2012-03-16 12:55:12.181947 osd.14
10.255.0.63:6818/2014 34284 : [ERR] 2.117 scrub 2 errors


A "PG repair" fixed one error:

2012-03-16 12:56:27.288690   log 2012-03-16 12:56:21.598432 osd.14
10.255.0.63:6818/2014 34285 : [ERR] 2.117 osd.0: soid
818bc117/vol-FTU86AEJ.rbd/headsize 0 != known size 112
2012-03-16 12:56:27.288690   log 2012-03-16 12:56:21.598457 osd.14
10.255.0.63:6818/2014 34286 : [ERR] 2.117 repair 0 missing, 1
inconsistent objects
2012-03-16 12:56:27.288690   log 2012-03-16 12:56:21.600277 osd.14
10.255.0.63:6818/2014 34287 : [ERR] repair 2.117
818bc117/vol-FTU86AEJ.rbd/head on disk size (112) does not match
object info size (0)
2012-03-16 12:56:27.288690   log 2012-03-16 12:56:21.600805 osd.14
10.255.0.63:6818/2014 34288 : [ERR] 2.117 repair stat mismatch, got
956/955 objects, 0/0 clones, 3951263856/3951263744 bytes.
2012-03-16 12:56:27.288690   log 2012-03-16 12:56:21.600849 osd.14
10.255.0.63:6818/2014 34289 : [ERR] 2.117 repair 2 errors, 1 fixed
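
As a sanity check on the "stat mismatch" line, the differences between the
paired values can be computed with a short Python sketch. It assumes that the
first number of each pair is what the scrub counted and the second is what the
PG statistics recorded; the function name stat_deltas is made up for this
illustration.

```python
import re

# Stat mismatch line copied from the scrub output above.
LINE = ("2.117 scrub stat mismatch, got "
        "956/955 objects, 0/0 clones, 3951263856/3951263744 bytes.")


def stat_deltas(line):
    """Return (objects, clones, bytes) deltas, first value minus second value."""
    pairs = re.findall(r"(\d+)/(\d+) (objects|clones|bytes)", line)
    deltas = {name: int(first) - int(second) for first, second, name in pairs}
    return deltas["objects"], deltas["clones"], deltas["bytes"]


print(stat_deltas(LINE))  # (1, 0, 112)
```

The difference is one object and 112 bytes, which is the same size as the
on-disk size reported for vol-FTU86AEJ.rbd, so both errors appear to refer to
that single object.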


On the filesystem (XFS) I can see the corresponding file:

# ls -l  /ceph/osd.014/current/2.117_head/DIR_7/DIR_1/DIR_1/vol-FTU86AEJ.rbd__head_818BC117
-rw-r--r-- 1 root root 112 Mar  1 20:32 vol-FTU86AEJ.rbd__head_818BC117

and I can read the object with rbd info:

# rbd info vol-FTU86AEJ
rbd image 'vol-FTU86AEJ':
        size 102400 MB in 25600 objects
        order 22 (4096 KB objects)
        block_name_prefix: rb.0.32
        parent:  (pool -1)
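
The two outputs above can be cross-checked with a small Python sketch. The
directory rule in hashed_dirs is only inferred from the single path shown
here (trailing hex digits of the object hash, last digit first) and is an
assumption, not taken from the Ceph source; all function names are
hypothetical.

```python
def hashed_dirs(hash_hex, depth):
    """Build DIR_x components from the trailing hex digits, last digit first.

    Assumption: this mirrors the layout seen in the path above.
    """
    digits = hash_hex.upper()[::-1][:depth]
    return "/".join("DIR_" + digit for digit in digits)


def object_file_name(object_name, hash_hex, snap="head"):
    """Compose the on-disk file name as it appears in the ls output."""
    return "%s__%s_%s" % (object_name, snap, hash_hex.upper())


def image_layout(size_mb, order):
    """Return (object size in KB, number of objects) for an image."""
    object_bytes = 1 << order  # order 22 means 2**22 bytes per object
    object_kb = object_bytes // 1024
    num_objects = size_mb * 1024 * 1024 // object_bytes
    return object_kb, num_objects


print(hashed_dirs("818bc117", 3))
print(object_file_name("vol-FTU86AEJ.rbd", "818bc117"))
print(image_layout(102400, 22))
```

With the values from this report, the sketch reproduces DIR_7/DIR_1/DIR_1,
the file name vol-FTU86AEJ.rbd__head_818BC117, and 25600 objects of 4096 KB,
so the header on disk and the rbd info output agree with each other.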

What I do not understand is why Ceph seems to think that
the object should no longer exist.

Any hints on how to proceed? Please note that I can do only limited
testing, because the cluster is in production.

Thanks,
Christian