getattr - failed to rdlock waiting


 



Hi Folks,

 

I am looking for advice on how to troubleshoot some long-running operations in the MDS. Most of the time performance is fantastic, but occasionally, with no discernible pattern or trend, a getattr op takes up to ~30 seconds to complete in the MDS, stuck on "event": "failed to rdlock, waiting".

 

E.g.

"description": "client_request(client.84183:54794012 getattr pAsLsXsFs #0x10000038585 2018-10-02 07:56:27.554282 caller_uid=48, caller_gid=48{})",

"duration": 28.987992,

{

"time": "2018-09-25 07:56:27.552511",

"event": "failed to rdlock, waiting"

},

{

"time": "2018-09-25 07:56:56.529748",

"event": "failed to rdlock, waiting"

},

{

"time": "2018-09-25 07:56:56.540386",

"event": "acquired locks"

}
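
For anyone wanting to sift MDS op dumps for stalls like the one above, here is a minimal Python sketch. It assumes the JSON comes from the MDS admin socket (e.g. `ceph daemon mds.<name> dump_historic_ops`), that each op sits under a top-level "ops" list with its events under "type_data", and that timestamps use the format shown in the excerpt. Those layout details and the function name `slow_rdlock_ops` are assumptions for illustration and may need adjusting for other Ceph releases.

```python
from datetime import datetime

# Assumed timestamp layout, taken from the excerpt above.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
WAIT_EVENT = "failed to rdlock, waiting"
ACQUIRED_EVENT = "acquired locks"


def slow_rdlock_ops(dump, min_duration=5.0):
    """Return (description, duration, wait_seconds) for slow ops that waited on an rdlock.

    `dump` is the parsed JSON of an op dump. The "ops" / "type_data" / "events"
    nesting is an assumption about the dump layout, not a documented contract.
    wait_seconds is None when no "acquired locks" event is present.
    """
    results = []
    for op in dump.get("ops", []):
        duration = float(op.get("duration", 0.0))
        events = op.get("type_data", {}).get("events", [])
        waits = [e for e in events if e.get("event") == WAIT_EVENT]
        if duration < min_duration or not waits:
            continue  # fast op, or never blocked on an rdlock

        acquired = [e for e in events if e.get("event") == ACQUIRED_EVENT]
        wait_seconds = None
        if acquired:
            # Time from the first failed rdlock attempt to the locks being acquired.
            start = datetime.strptime(waits[0]["time"], TIME_FORMAT)
            end = datetime.strptime(acquired[-1]["time"], TIME_FORMAT)
            wait_seconds = (end - start).total_seconds()

        results.append((op.get("description", ""), duration, wait_seconds))

    # Slowest ops first.
    return sorted(results, key=lambda row: row[1], reverse=True)
```

To use it, load the saved dump with `json.load` and pass the resulting dict to `slow_rdlock_ops`. The inode number in each returned description (such as #0x10000038585) can then be compared across slow ops to see whether the same file or directory keeps showing up.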

 

I can find no corresponding long op on any of the OSDs and no other op in MDS which this one could be waiting for.

Nearly all of the configuration is at the defaults. The cluster currently holds a small amount of data that is constantly being updated, with 1 data pool and 1 metadata pool.

How can I track down what is holding up this op and stop it from happening?

 

# rados df

total_objects    191

total_used       5.7 GiB

total_avail      367 GiB

total_space      373 GiB

 

 

CephFS version 13.2.1 on CentOS 7.5

Kernel: 3.10.0-862.11.6.el7.x86_64

1x Active MDS, 1x Replay Standby MDS

3x MON

4x OSD

Bluestore FS

 

Ceph kernel client on CentOS 7.4

Kernel: 4.18.7-1.el7.elrepo.x86_64  (almost the latest, should be good?)

 

Many Thanks!

Tom

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
