cephfs deep scrub error

Hi John,

Last week we updated our prod CephFS cluster to 10.2.6 (clients and
server side), and today we hit our first object info size mismatch.

I found this ticket you created in the tracker, which is why I've
emailed you: http://tracker.ceph.com/issues/18240

Here's the detail of our error:

2017-03-13 07:17:49.989297 osd.67 <snip>:6819/3441125 262 : cluster
[ERR] deep-scrub 1.3da 1:5bc0e9dc:::10000260f4b.00000003:head on disk
size (4187974) does not match object info size (4193094) adjusted for
ondisk to (4193094)
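
For reference, I pulled the PG and object out of the scrub results
with the standard Jewel tooling (illustrative; exact output will
vary):

# ceph health detail | grep 1.3da
# rados list-inconsistent-obj 1.3da --format=json-pretty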

All three replicas have the same object size and md5sum:

# ls -l 10000260f4b.00000003__head_3B9703DA__1
-rw-r--r--. 1 ceph ceph 4187974 Mar 12 18:50
10000260f4b.00000003__head_3B9703DA__1
# md5sum 10000260f4b.00000003__head_3B9703DA__1
db1e1bab199b33fce3ad9195832626ef 10000260f4b.00000003__head_3B9703DA__1
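
(To locate those files I ran something like the following on each
replica's OSD host -- the FileStore path below matches our layout, so
adjust as needed:)

# find /var/lib/ceph/osd/ceph-67/current/1.3da_head/ \
    -name '*10000260f4b.00000003*'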

And indeed the object info does not agree with the files on disk:

# ceph-dencoder type object_info_t import /tmp/attr1 decode dump_json
{
    "oid": {
        "oid": "10000260f4b.00000003",
        "key": "",
        "snapid": -2,
        "hash": 999752666,
        "max": 0,
        "pool": 1,
        "namespace": ""
    },
    "version": "5262'221037",
    "prior_version": "5262'221031",
    "last_reqid": "osd.67.0:1180241",
    "user_version": 221031,
    "size": 4193094,
    "mtime": "0.000000",
    "local_mtime": "0.000000",
    "lost": 0,
    "flags": 52,
    "snaps": [],
    "truncate_seq": 80,
    "truncate_size": 0,
    "data_digest": 2779145704,
    "omap_digest": 4294967295,
    "watchers": {}
}
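
(For completeness, /tmp/attr1 above is the object_info xattr taken
straight from the on-disk file; on FileStore it lives in the
"user.ceph._" xattr, so something like this extracts it:)

# getfattr -n user.ceph._ --only-values \
    10000260f4b.00000003__head_3B9703DA__1 > /tmp/attr1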


PG repair doesn't handle this kind of corruption, but I found a recipe
in an old thread for fixing the object info with hexedit. Before doing
that I wanted to see if we can understand exactly how this is
possible.
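
In case it helps others, the rough shape of that recipe as I
understand it: stop the OSD on each replica, patch the "_" attr,
restart, and re-scrub. Paths and the object spec here are
illustrative, not tested:

# systemctl stop ceph-osd@67
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-67 \
    --journal-path /var/lib/ceph/osd/ceph-67/journal \
    10000260f4b.00000003 get-attr _ > /tmp/oi
# hexedit /tmp/oi      # patch the encoded size (4193094 -> 4187974)
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-67 \
    --journal-path /var/lib/ceph/osd/ceph-67/journal \
    10000260f4b.00000003 set-attr _ /tmp/oi
# systemctl start ceph-osd@67
# ceph pg deep-scrub 1.3da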

I managed to find the exact CephFS file and asked the user how they
had created it. They said the file was the output of a "make test" run
on some program. The test was taking a while, so they stepped away,
and when they came back, the ssh connection to their CephFS
workstation had broken. I assume this means the process writing the
file was killed mid-write. But I don't understand how a killed client
process could result in inconsistent object info.

Is there anything else we can provide to help debug this
inconsistency?

Cheers, Dan