Re: Corrupted files on CephFS since Luminous upgrade

Hi all,

Thank you for your reply. I will answer your questions, try to reproduce it and, if I succeed, start a new thread. It may take a while, as I'm quite busy.

My cluster was upgraded from Hammer or Jewel.

The Luminous cluster was healthy when I started my test. It is possible that the load temporarily caused a health change (an OSD going out and back in). After the test, my cluster was healthy again.

My first test was copying from a separate client using the CephFS kernel client. The second test was local copying with ceph-fuse. Both produced some file(s) with MD5 mismatches.

Before I run the tests again, I will upgrade to the latest Luminous and let you know.

With regards
Jan Pekar

On 28.2.2018 15:14, David C wrote:


On 27 Feb 2018 06:46, "Jan Pekař - Imatic" <jan.pekar@xxxxxxxxx> wrote:

    I think I hit the same issue.
    I have corrupted data on CephFS and I don't remember seeing this issue
    before Luminous (I ran the same tests before).

    It is on my single-node test cluster with less memory than recommended
    (so the server is swapping), but it shouldn't lose data (it never did
    before).
    Slow requests may therefore appear in the log, as Florent B mentioned.

    My test is to take some bigger files (a few GB) and copy them to CephFS,
    or from CephFS to CephFS, and stress the cluster so that the copy stalls
    for a while. It resumes after a few seconds/minutes and everything looks
    OK (no error during copying), but the copied file may be silently
    corrupted.
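
    A minimal sketch of such a copy-and-verify test in Python (the paths
    below are hypothetical placeholders, not my actual setup):

        import hashlib
        import shutil

        def md5(path, chunk=4 * 1024 * 1024):
            """Stream the file and return its MD5 hex digest."""
            h = hashlib.md5()
            with open(path, 'rb') as f:
                for block in iter(lambda: f.read(chunk), b''):
                    h.update(block)
            return h.hexdigest()

        SRC = '/data/bigfile.bin'        # hypothetical file outside CephFS
        DST = '/mnt/cephfs/bigfile.bin'  # hypothetical CephFS mount point

        shutil.copy(SRC, DST)            # copy while the cluster is under load
        if md5(SRC) != md5(DST):
            print('MD5 mismatch: the copied file is corrupted')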

    I checked the files with md5sum and compared some corrupted files in
    detail. Some 4 MB blocks of data (the CephFS object size) were missing -
    the corrupted file had those blocks filled with zeroes.
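
    A minimal sketch of how such zero-filled blocks can be located in Python,
    assuming the default 4 MB object size and a hypothetical file path:

        # Report every 4 MB block that consists entirely of zero bytes.
        OBJECT_SIZE = 4 * 1024 * 1024  # default CephFS object size

        with open('/mnt/cephfs/bigfile.bin', 'rb') as f:
            index = 0
            while True:
                block = f.read(OBJECT_SIZE)
                if not block:
                    break
                if block == b'\x00' * len(block):
                    print('block %d at offset %d is all zeroes'
                          % (index, index * OBJECT_SIZE))
                index += 1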

    My idea is that something goes wrong when the cluster is under pressure
    and the client wants to write a block. The client gets an OK and
    continues with the next block, so the data is lost and the corrupted
    block is filled with zeros.

    I tried the 4.x kernel client and the ceph-fuse client with the same result.

    I'm using erasure coding for the CephFS data pool with a cache tier, and
    my storage is a mix of BlueStore and FileStore.

    How can I help debug this, or what should I do to help find the problem?


Always worrying to see the dreaded C word. I operate a Luminous cluster with a pretty varied workload and have yet to see any signs of corruption, although of course that doesn't mean it's not happening. Initial questions:

- What's the history of your cluster? Was this an upgrade or a fresh Luminous install?
- Was Ceph healthy when you ran this test?
- Are you accessing this one-node cluster from the node itself or from a separate client?

I'd recommend starting a new thread with more details. It sounds like it's pretty reproducible for you, so maybe crank up your debugging and send logs: http://docs.ceph.com/docs/luminous/dev/kernel-client-troubleshooting/


    With regards
    Jan Pekar


    On 14.12.2017 15:41, Yan, Zheng wrote:

        On Thu, Dec 14, 2017 at 8:52 PM, Florent B <florent@xxxxxxxxxxx> wrote:

            On 14/12/2017 03:38, Yan, Zheng wrote:

                On Thu, Dec 14, 2017 at 12:49 AM, Florent B <florent@xxxxxxxxxxx> wrote:


                    Systems are on Debian Jessie: kernel 3.16.0-4-amd64
                    & libfuse 2.9.3-15.

                    I don't know the pattern of corruption, but according to
                    the error message in Dovecot, it seems to expect more data
                    to read but reaches EOF.

                    Everything seems fine using fuse_disable_pagecache (no
                    more corruption, and performance increased: no more MDS
                    slow requests on filelock requests).
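
                    For reference, a minimal ceph.conf sketch of how I assume
                    that option is set on the client side (option name as used
                    above, in the [client] section):

                        [client]
                        fuse_disable_pagecache = true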


                I checked the ceph-fuse changes since Kraken and didn't find
                any clue. It would be helpful if you could try a recent
                kernel version.

                Regards
                Yan, Zheng


            The problem occurred this morning even with
            fuse_disable_pagecache=true.

            It seems to be a lock issue between imap & lmtp processes.

            Dovecot uses fcntl as its locking method. Did anything change
            about it in Luminous? I switched to flock to see if the problem
            is still there...
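
            To illustrate the difference, a minimal Python sketch of the two
            locking styles on a hypothetical file in the CephFS mount
            (Dovecot itself does this in C; this is only illustrative):

                import fcntl

                f = open('/mnt/cephfs/mailbox.lock', 'wb')

                # POSIX record lock - what "fcntl" locking means here
                fcntl.lockf(f, fcntl.LOCK_EX)
                fcntl.lockf(f, fcntl.LOCK_UN)

                # BSD-style flock lock
                fcntl.flock(f, fcntl.LOCK_EX)
                fcntl.flock(f, fcntl.LOCK_UN)

                f.close()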


        I don't remember any change there.
        _______________________________________________
        ceph-users mailing list
        ceph-users@xxxxxxxxxxxxxx
        http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


    --
    ============
    Ing. Jan Pekař
    jan.pekar@xxxxxxxxx | +420603811737
    ----
    Imatic | Jagellonská 14 | Praha 3 | 130 00
    http://www.imatic.cz
    ============

    _______________________________________________
    ceph-users mailing list
    ceph-users@xxxxxxxxxxxxxx
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


--
============
Ing. Jan Pekař
jan.pekar@xxxxxxxxx | +420603811737
----
Imatic | Jagellonská 14 | Praha 3 | 130 00
http://www.imatic.cz
============
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



