Re: ceph fs (meta) data inconsistent

On 11/1/23 23:57, Gregory Farnum wrote:
We have seen issues like this a few times and they have all been kernel client bugs with CephFS’ internal “capability” file locking protocol. I’m not aware of any extant bugs like this in our code base, but kernel patches can take a long and winding path before they end up on deployed systems.

Most likely, if you were to restart some combination of the client which wrote the file and the client(s) reading it, the size would propagate correctly. As long as you’ve synced the data, it’s definitely present in the cluster.
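
If it helps, a rough way to confirm that from the cluster side is to stat the file's first RADOS object directly. This is only a sketch: the mount path is an example, and the pool has to be whichever data pool the file's layout actually points at (con-fs2-data or con-fs2-data2 in your case).

    # On a client that still sees the correct size, get the file's inode number.
    ino=$(stat -c %i /mnt/cephfs/h2lib/dll_wrapper.py)

    # CephFS names data objects <inode-in-hex>.<block-index>; a small
    # default-layout file lives entirely in object <hex-ino>.00000000.
    obj=$(printf '%x.00000000' "$ino")

    # Ask RADOS for the object's size and mtime in the data pool.
    rados -p con-fs2-data stat "$obj"

If that reports the expected size, the data is safe and it is only the size metadata seen by the other clients that is stale.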

Adding Xiubo, who has worked on these and may have other comments.
-Greg

As I recall, we have had some snapshot-related fixes like this in the kclient, and also a fix for one ctime inconsistency issue in the MDS.

But this sounds more like an MDS-side bug. I just went through the kclient patches and didn't find any commit fixing this yet.

Thanks

- Xiubo


On Wed, Nov 1, 2023 at 7:16 AM Frank Schilder <frans@xxxxxx> wrote:

    Dear fellow cephers,

    today we observed a somewhat worrisome inconsistency on our ceph
    fs. A file created on one host showed up as 0 length on all other
    hosts:

    [user1@host1 h2lib]$ ls -lh
    total 37M
    -rw-rw---- 1 user1 user1  12K Nov  1 11:59 dll_wrapper.py

    [user2@host2 h2lib]# ls -l
    total 34
    -rw-rw----. 1 user1 user1     0 Nov  1 11:59 dll_wrapper.py

    [user1@host1 h2lib]$ cp dll_wrapper.py dll_wrapper.py.test
    [user1@host1 h2lib]$ ls -l
    total 37199
    -rw-rw---- 1 user1 user1    11641 Nov  1 11:59 dll_wrapper.py
    -rw-rw---- 1 user1 user1    11641 Nov  1 13:10 dll_wrapper.py.test

    [user2@host2 h2lib]# ls -l
    total 45
    -rw-rw----. 1 user1 user1     0 Nov  1 11:59 dll_wrapper.py
    -rw-rw----. 1 user1 user1 11641 Nov  1 13:10 dll_wrapper.py.test

    Executing a sync on all these hosts did not help. However, deleting
    the problematic file and replacing it with a copy seemed to work
    around the issue. We saw this with ceph kclients of different
    versions, so it seems to be on the MDS side.
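
    If relevant, the capabilities each kernel client still holds on the
    inode can be inspected via the ceph debugfs interface. This is only
    a sketch: the directory name combines the cluster fsid and the
    client's global id, and the inode numbers in the caps file are
    typically printed in hex, so the exact output format may differ by
    kernel:

    # on an affected client, note the file's inode number
    stat -c %i dll_wrapper.py
    # then list the caps this client currently holds per inode
    cat /sys/kernel/debug/ceph/*/caps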

    How can this happen and how dangerous is it?

    ceph fs status (showing ceph version):

    # ceph fs status
    con-fs2 - 1662 clients
    =======
    RANK  STATE     MDS       ACTIVITY     DNS    INOS
     0    active  ceph-15  Reqs:   14 /s  2307k  2278k
     1    active  ceph-11  Reqs:  159 /s  4208k  4203k
     2    active  ceph-17  Reqs:    3 /s  4533k  4501k
     3    active  ceph-24  Reqs:    3 /s  4593k  4300k
     4    active  ceph-14  Reqs:    1 /s  4228k  4226k
     5    active  ceph-13  Reqs:    5 /s  1994k  1782k
     6    active  ceph-16  Reqs:    8 /s  5022k  4841k
     7    active  ceph-23  Reqs:    9 /s  4140k  4116k
            POOL           TYPE     USED  AVAIL
       con-fs2-meta1     metadata  2177G  7085G
       con-fs2-meta2       data       0   7085G
        con-fs2-data       data    1242T  4233T
    con-fs2-data-ec-ssd    data     706G  22.1T
       con-fs2-data2       data    3409T  3848T
    STANDBY MDS
      ceph-10
      ceph-08
      ceph-09
      ceph-12
    MDS version: ceph version 15.2.17
    (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)

    There is no relevant health issue (only a few PGs behind on deep scrubbing):

    # ceph status
      cluster:
        id:     abc
        health: HEALTH_WARN
                3 pgs not deep-scrubbed in time

      services:
        mon: 5 daemons, quorum ceph-01,ceph-02,ceph-03,ceph-25,ceph-26
    (age 9w)
        mgr: ceph-25(active, since 7w), standbys: ceph-26, ceph-01,
    ceph-03, ceph-02
        mds: con-fs2:8 4 up:standby 8 up:active
        osd: 1284 osds: 1279 up (since 2d), 1279 in (since 5d)

      task status:

      data:
        pools:   14 pools, 25065 pgs
        objects: 2.20G objects, 3.9 PiB
        usage:   4.9 PiB used, 8.2 PiB / 13 PiB avail
        pgs:     25039 active+clean
                 26    active+clean+scrubbing+deep

      io:
        client:   799 MiB/s rd, 55 MiB/s wr, 3.12k op/s rd, 1.82k op/s wr

    The inconsistency seems undiagnosed; I couldn't find anything
    interesting in the cluster log. What should I look for and where?
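
    For reference, one place to start looking on the MDS side might be
    the following (a sketch only; ceph-15 is just the rank-0 MDS from
    the status above, and the right daemon is whichever rank is
    authoritative for the directory):

    # on the MDS host, dump the cache to a file and search it for the
    # problem dentry/inode to see the size and client caps the MDS has
    ceph daemon mds.ceph-15 dump cache /tmp/mds-cache.txt
    grep -A 2 dll_wrapper.py /tmp/mds-cache.txt

    # optionally raise MDS debug logging while reproducing, then reset it
    ceph tell mds.ceph-15 config set debug_mds 10
    ceph tell mds.ceph-15 config set debug_mds 1/5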

    I moved the folder to another location for diagnosis.
    Unfortunately, I no longer have two clients showing different
    numbers; I now see a 0 length everywhere for the moved folder. I'm
    pretty sure, though, that the file is still non-zero length.

    Thanks for any pointers.
    =================
    Frank Schilder
    AIT Risø Campus
    Bygning 109, rum S14
    _______________________________________________
    ceph-users mailing list -- ceph-users@xxxxxxx
    To unsubscribe send an email to ceph-users-leave@xxxxxxx




