ceph-fuse clients taking too long to update dir sizes

Dear CephFSers.

We are running Ceph/CephFS 10.2.2. All infrastructure is on the same version (RADOS cluster, MONs, MDS and CephFS clients). We mount CephFS using ceph-fuse.

Last week I asked some of my heavy users to delete data. In the following example, the user in question decreased his usage from ~4.5TB to ~600GB. However, some clients still have not updated the usage (although several days have passed), while others are fine.

From the MDS's point of view, both types of client have healthy sessions. See the detailed info after this email.

Kicking the session does not solve the issue. A remount probably would, but users are heavily using the filesystem and I do not want to disrupt them right now.


The only difference I can actually dig out between the "good" and "bad" clients is that the user still has active bash sessions on the "bad" client (from which he triggered the deletions):

# lsof | grep user1 | grep ceph
bash      15737   user1  cwd       DIR               0,24 5285584388909 1099514070586 /coepp/cephfs/mel/user1
vim       19233   user1  cwd       DIR               0,24      24521126 1099514340633 /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts
vim       19233   user1    5u      REG               0,24         16384 1099557935412 /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts/.histmgr.py.swp
bash      24187   user1  cwd       DIR               0,24     826758558 1099514314315 /coepp/cephfs/mel/user1/Analysis
bash      24256   user1  cwd       DIR               0,24        147600 1099514340621 /coepp/cephfs/mel/user1/Analysis/ssdilep/run
bash      24327   user1  cwd       DIR               0,24        151068 1099514340590 /coepp/cephfs/mel/user1/Analysis/ssdilep/algs
bash      24394   user1  cwd       DIR               0,24        151068 1099514340590 /coepp/cephfs/mel/user1/Analysis/ssdilep/algs
bash      24461   user1  cwd       DIR               0,24        356436 1099514340614 /coepp/cephfs/mel/user1/Analysis/ssdilep/samples
bash      24528   user1  cwd       DIR               0,24      24521126 1099514340633 /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts
bash      24601   user1  cwd       DIR               0,24      24521126 1099514340633 /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts

Is there a particular way to force the client to update this info? Do we actually know why it is taking so long to update?

Cheers

Goncalo

--- * ---


1) Reports from a client which shows "obsolete" file/directory sizes:

# ll -h /coepp/cephfs/mel/ | grep user1
drwxr-xr-x 1 user1      coepp_mel 4.9T Oct  7 00:20 user1

# getfattr -d -m ceph /coepp/cephfs/mel/user1
getfattr: Removing leading '/' from absolute path names
# file: coepp/cephfs/mel/user1
ceph.dir.entries="10"
ceph.dir.files="1"
ceph.dir.rbytes="5285584388909"
ceph.dir.rctime="1480390891.09882864298"
ceph.dir.rentries="161047"
ceph.dir.rfiles="149669"
ceph.dir.rsubdirs="11378"
ceph.dir.subdirs="9"
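As a cross-check (my own sketch, not from the thread): the size `ls -h` prints for a CephFS directory comes from `ceph.dir.rbytes` (ceph-fuse reports rbytes as the directory size by default, as far as I know), and coreutils' `-h` rounds up. The figures above can be reproduced like this:

```python
# Sketch: reproduce the "ll -h" figures from the ceph.dir.rbytes values
# pasted in this thread. The rounding mimics coreutils -h (powers of 1024,
# round *up*, one decimal only below 10).
import math

def human(nbytes):
    """Approximate coreutils 'ls -h' formatting."""
    units = ["", "K", "M", "G", "T", "P"]
    i = 0
    v = float(nbytes)
    while v >= 1024 and i < len(units) - 1:
        v /= 1024.0
        i += 1
    if v < 10:
        return "%.1f%s" % (math.ceil(v * 10) / 10.0, units[i])
    return "%d%s" % (math.ceil(v), units[i])

print(human(5285584388909))  # stale client -> 4.9T
print(human(617756983774))   # fresh client -> 576G
```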

---> Running the following command on the client:
# ceph daemon /var/run/ceph/ceph-client.mount_user.asok mds_sessions
{
    "id": 616794,
    "sessions": [
        {
            "mds": 0,
            "addr": "<MDS IP>:6800\/1457",
            "seq": 4884237,
            "cap_gen": 0,
            "cap_ttl": "2016-12-04 22:45:53.046697",
            "last_cap_renew_request": "2016-12-04 22:44:53.046697",
            "cap_renew_seq": 166765,
            "num_caps": 1567318,
            "state": "open"
        }
    ],
    "mdsmap_epoch": 5224
}
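For anyone checking their own clients: the session looks healthy when it is "open" and `cap_ttl` sits one renew interval after `last_cap_renew_request`. A quick sketch over the JSON above (field names as they appear in 10.2.2):

```python
# Sketch: sanity-check a client's "mds_sessions" dump. A session whose caps
# are being renewed should be "open" with cap_ttl later than the last renew
# request. Data below is the stale client's dump from this thread.
import json
from datetime import datetime

dump = json.loads("""
{
    "id": 616794,
    "sessions": [
        {
            "mds": 0,
            "state": "open",
            "cap_ttl": "2016-12-04 22:45:53.046697",
            "last_cap_renew_request": "2016-12-04 22:44:53.046697",
            "num_caps": 1567318
        }
    ]
}
""")

fmt = "%Y-%m-%d %H:%M:%S.%f"
for s in dump["sessions"]:
    ttl = datetime.strptime(s["cap_ttl"], fmt)
    renew = datetime.strptime(s["last_cap_renew_request"], fmt)
    ok = s["state"] == "open" and ttl > renew
    print("mds.%d: state=%s caps=%d renewing=%s"
          % (s["mds"], s["state"], s["num_caps"], ok))
```

Note the "bad" client passes this check too, which is exactly the puzzle: the session is fine, only the rstats are stale.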

---> Running the following command on the MDS:
# ceph daemon mds.rccephmds session ls
(...)

   {
        "id": 616794,
        "num_leases": 0,
        "num_caps": 21224,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.616794 <BAD CLIENT IP>:0\/68088301",
        "client_metadata": {
            "ceph_sha1": "45107e21c568dd033c2f0a3107dec8f0b0e58374",
            "ceph_version": "ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)",
            "entity_id": "mount_user",
            "hostname": "badclient.my.domain",
            "mount_point": "\/coepp\/cephfs",
            "root": "\/cephfs"
        }
    },

2) Reports from a client which shows "good" file/directory sizes:

# ll -h /coepp/cephfs/mel/ | grep user1
drwxr-xr-x 1 user1      coepp_mel 576G Oct  7 00:20 user1

# getfattr -d -m ceph /coepp/cephfs/mel/user1
getfattr: Removing leading '/' from absolute path names
# file: coepp/cephfs/mel/user1
ceph.dir.entries="10"
ceph.dir.files="1"
ceph.dir.rbytes="617756983774"
ceph.dir.rctime="1480844101.09560671770"
ceph.dir.rentries="96519"
ceph.dir.rfiles="95091"
ceph.dir.rsubdirs="1428"
ceph.dir.subdirs="9"

---> Running the following command on the client:
# ceph daemon /var/run/ceph/ceph-client.mount_user.asok mds_sessions
{
    "id": 616338,
    "sessions": [
        {
            "mds": 0,
            "addr": "<MDS IP>:6800\/1457",
            "seq": 7851161,
            "cap_gen": 0,
            "cap_ttl": "2016-12-04 23:32:30.041978",
            "last_cap_renew_request": "2016-12-04 23:31:30.041978",
            "cap_renew_seq": 169143,
            "num_caps": 311386,
            "state": "open"
        }
    ],
    "mdsmap_epoch": 5224
}

    ---> Output of the same `session ls` command on the MDS:
    {
        "id": 616338,
        "num_leases": 0,
        "num_caps": 16078,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.616338 <GOOD CLIENT IP>:0\/3807825927",
        "client_metadata": {
            "ceph_sha1": "45107e21c568dd033c2f0a3107dec8f0b0e58374",
            "ceph_version": "ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)",
            "entity_id": "mount_user",
            "hostname": "goodclient.my.domain",
            "mount_point": "\/coepp\/cephfs",
            "root": "\/cephfs"
        }
    },
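For what it's worth, diffing the two getfattr reports above (my own arithmetic, just to quantify the staleness) shows how much the "bad" client has not yet caught up on:

```python
# Sketch: difference between the stale and fresh rstat views of
# /coepp/cephfs/mel/user1, using the getfattr numbers pasted above.
stale = {"rbytes": 5285584388909, "rentries": 161047,
         "rfiles": 149669, "rsubdirs": 11378}
fresh = {"rbytes": 617756983774, "rentries": 96519,
         "rfiles": 95091, "rsubdirs": 1428}

diff = {k: stale[k] - fresh[k] for k in stale}
print("bytes not yet propagated: %.2f TiB" % (diff["rbytes"] / 2.0 ** 40))
print("entries: %(rentries)d (files %(rfiles)d, subdirs %(rsubdirs)d)" % diff)
```

So the stale client is still accounting for roughly 4.25 TiB and ~64k entries that the fresh client already knows are gone.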

-- 
Goncalo Borges
Research Computing
ARC Centre of Excellence for Particle Physics at the Terascale
School of Physics A28 | University of Sydney, NSW  2006
T: +61 2 93511937
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
