On Sun, Dec 4, 2016 at 11:51 PM, Goncalo Borges
<goncalo.borges@xxxxxxxxxxxxx> wrote:
> Dear CephFSers.
>
> We are running ceph/cephfs in 10.2.2. All infrastructure is on the same
> version (rados cluster, mons, mds and cephfs clients). We mount cephfs
> using ceph-fuse.
>
> Last week I asked some of my heavy users to delete data. In the
> following example, the user in question decreased his usage from ~4.5TB
> to ~600GB. However, some clients have still not updated the usage
> (although several days have passed) while others are ok.
>
> From the point of view of the MDS, both types of client have healthy
> sessions. See detailed info after this email.
>
> Trying to kick the session does not solve the issue. Probably only a
> remount would, but users are heavily using the filesystem and I do not
> want to break things for them now.
>
> The only difference I can actually dig out between the "good" and "bad"
> clients is that the user still has active bash sessions on the "bad"
> client (from where he triggered the deletions).

You're saying that the clients that actually did the deletions are the
ones with the bad rstats, but the other clients are getting the updates?
Really weird.

> # lsof | grep user1 | grep ceph
> bash 15737 user1 cwd DIR 0,24 5285584388909 1099514070586 /coepp/cephfs/mel/user1
> vim  19233 user1 cwd DIR 0,24      24521126 1099514340633 /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts
> vim  19233 user1 5u  REG 0,24         16384 1099557935412 /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts/.histmgr.py.swp
> bash 24187 user1 cwd DIR 0,24     826758558 1099514314315 /coepp/cephfs/mel/user1/Analysis
> bash 24256 user1 cwd DIR 0,24        147600 1099514340621 /coepp/cephfs/mel/user1/Analysis/ssdilep/run
> bash 24327 user1 cwd DIR 0,24        151068 1099514340590 /coepp/cephfs/mel/user1/Analysis/ssdilep/algs
> bash 24394 user1 cwd DIR 0,24        151068 1099514340590 /coepp/cephfs/mel/user1/Analysis/ssdilep/algs
> bash 24461 user1 cwd DIR 0,24        356436 1099514340614 /coepp/cephfs/mel/user1/Analysis/ssdilep/samples
> bash 24528 user1 cwd DIR 0,24      24521126 1099514340633 /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts
> bash 24601 user1 cwd DIR 0,24      24521126 1099514340633 /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts
>
> Is there a particular way to force the client to update this info? Do we
> actually know why it is taking so long to update it?

Recursive statistics are meant to be updated somewhat lazily, but
obviously they are meant to *eventually* update, so if days are going by
without them catching up then that's a bug.

Could you try and come up with a simple reproducer, perhaps with just two
clients involved?
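Something along these lines might do as a starting point (the hostnames
clientA/clientB and the test path are just placeholders, so adjust for
your setup): write and then delete a chunk of data from one client, and
poll ceph.dir.rbytes from both clients:

  # hypothetical two-client check -- clientA does the delete, clientB only watches
  TESTDIR=/coepp/cephfs/mel/rstat-test

  # create ~1GB of data and then remove it, all from clientA
  ssh clientA "mkdir -p $TESTDIR && dd if=/dev/zero of=$TESTDIR/blob bs=1M count=1024"
  ssh clientA "rm -f $TESTDIR/blob"

  # read the recursive byte count from each client; repeat over time to see
  # whether the client that did the delete ever converges
  for host in clientA clientB; do
      echo -n "$host: "
      ssh $host "getfattr -n ceph.dir.rbytes --only-values $TESTDIR; echo"
  done

If the rbytes reported by the client that did the delete stay stuck while
the other client updates, that would reproduce what you're describing.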
John

> Cheers
>
> Goncalo
>
> --- * ---
>
>
> 1) Reports from a client which shows "obsolete" file/directory sizes:
>
> # ll -h /coepp/cephfs/mel/ | grep user1
> drwxr-xr-x 1 user1 coepp_mel 4.9T Oct 7 00:20 user1
>
> # getfattr -d -m ceph /coepp/cephfs/mel/user1
> getfattr: Removing leading '/' from absolute path names
> # file: coepp/cephfs/mel/user1
> ceph.dir.entries="10"
> ceph.dir.files="1"
> ceph.dir.rbytes="5285584388909"
> ceph.dir.rctime="1480390891.09882864298"
> ceph.dir.rentries="161047"
> ceph.dir.rfiles="149669"
> ceph.dir.rsubdirs="11378"
> ceph.dir.subdirs="9"
>
> ---> Running the following command in the client:
> # ceph daemon /var/run/ceph/ceph-client.mount_user.asok mds_sessions
> {
>     "id": 616794,
>     "sessions": [
>         {
>             "mds": 0,
>             "addr": "<MDS IP>:6800\/1457",
>             "seq": 4884237,
>             "cap_gen": 0,
>             "cap_ttl": "2016-12-04 22:45:53.046697",
>             "last_cap_renew_request": "2016-12-04 22:44:53.046697",
>             "cap_renew_seq": 166765,
>             "num_caps": 1567318,
>             "state": "open"
>         }
>     ],
>     "mdsmap_epoch": 5224
> }
>
> ---> Running the following command in the mds:
> # ceph daemon mds.rccephmds session ls
> (...)
>
> {
>     "id": 616794,
>     "num_leases": 0,
>     "num_caps": 21224,
>     "state": "open",
>     "replay_requests": 0,
>     "completed_requests": 0,
>     "reconnecting": false,
>     "inst": "client.616794 <BAD CLIENT IP>:0\/68088301",
>     "client_metadata": {
>         "ceph_sha1": "45107e21c568dd033c2f0a3107dec8f0b0e58374",
>         "ceph_version": "ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)",
>         "entity_id": "mount_user",
>         "hostname": "badclient.my.domain",
>         "mount_point": "\/coepp\/cephfs",
>         "root": "\/cephfs"
>     }
> },
>
> 2) Reports from a client which shows "good" file/directory sizes:
>
> # ll -h /coepp/cephfs/mel/ | grep user1
> drwxr-xr-x 1 user1 coepp_mel 576G Oct 7 00:20 user1
>
> # getfattr -d -m ceph /coepp/cephfs/mel/user1
> getfattr: Removing leading '/' from absolute path names
> # file: coepp/cephfs/mel/user1
> ceph.dir.entries="10"
> ceph.dir.files="1"
> ceph.dir.rbytes="617756983774"
> ceph.dir.rctime="1480844101.09560671770"
> ceph.dir.rentries="96519"
> ceph.dir.rfiles="95091"
> ceph.dir.rsubdirs="1428"
> ceph.dir.subdirs="9"
>
> ---> Running the following command in the client:
> # ceph daemon /var/run/ceph/ceph-client.mount_user.asok mds_sessions
> {
>     "id": 616338,
>     "sessions": [
>         {
>             "mds": 0,
>             "addr": "<MDS IP>:6800\/1457",
>             "seq": 7851161,
>             "cap_gen": 0,
>             "cap_ttl": "2016-12-04 23:32:30.041978",
>             "last_cap_renew_request": "2016-12-04 23:31:30.041978",
>             "cap_renew_seq": 169143,
>             "num_caps": 311386,
>             "state": "open"
>         }
>     ],
>     "mdsmap_epoch": 5224
> }
>
> ---> Running the following command in the mds:
>
> {
>     "id": 616338,
>     "num_leases": 0,
>     "num_caps": 16078,
>     "state": "open",
>     "replay_requests": 0,
>     "completed_requests": 0,
>     "reconnecting": false,
>     "inst": "client.616338 <GOOD CLIENT IP>:0\/3807825927",
>     "client_metadata": {
>         "ceph_sha1": "45107e21c568dd033c2f0a3107dec8f0b0e58374",
>         "ceph_version": "ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)",
>         "entity_id": "mount_user",
>         "hostname": "goodclient.my.domain",
>         "mount_point": "\/coepp\/cephfs",
>         "root": "\/cephfs"
>     }
> },
>
> --
> Goncalo Borges
> Research Computing
> ARC Centre of Excellence for Particle Physics at the Terascale
> School of Physics A28 | University of Sydney, NSW 2006
> T: +61 2 93511937
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>