High cephfs MDS latency and CPU load

Hi cephers,

Recently we've started using CephFS snapshots more, and we seem to be running into a rather annoying performance issue with the MDS.  The cluster in question is on Nautilus 14.2.20.

Typically, the MDS processes a few thousand requests per second, with all operations showing latencies in the few-millisecond range (in mds perf dump), and the system feels quite responsive (directory listings and general metadata operations are quick).  Every so often, the MDS transitions into a period of very high latency: 0.1 to 2 seconds per operation (as measured by increases in the latency counters in mds perf dump).  During these high-latency periods the request rate is about the same (low thousands of requests/s), but one MDS thread called 'fn_anonymous' is 100% busy.  Attaching a debugger and grabbing a stack trace of that thread at random times always shows a similar picture:

#0  0x0000555555787d4c in CInode::get_oldest_parent_dn (this=0x555843e14700) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/CInode.h:780
#1  CInode::is_ancestor_of (this=0x55561ebbee00, other=other@entry=0x555843e14700) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/CInode.cc:917
#2  0x00005555557e4147 in SnapRealm::split_at (this=this@entry=0x555557c3a500, child=0x5555c1ad7900) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/SnapRealm.cc:593
#3  0x00005555557a1a5f in CInode::open_snaprealm (this=0x55561ebbee00, nosplit=<optimized out>) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/CInode.cc:2764
#4  0x00005555557a1e96 in CInode::pop_projected_snaprealm (this=0x55561ebbee00, next_snaprealm=0x555571b24000, early=<optimized out>) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/CInode.cc:586
#5  0x00005555557a2462 in CInode::early_pop_projected_snaprealm (this=this@entry=0x55561ebbee00) at /usr/include/c++/8/bits/stl_list.h:1122
#6  0x00005555555d1dc0 in Server::_unlink_local_finish (this=0x555557055680, mdr=..., dn=0x5558e50845a0, straydn=0x55582c1981e0, dnpv=513) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/Server.cc:7025
#7  0x00005555558067df in Context::complete (r=0, this=0x555715b043c0) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/include/Context.h:77
#8  MDSContext::complete (this=0x555715b043c0, r=0) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/MDSContext.cc:29
#9  0x0000555555806a8f in MDSIOContextBase::complete (this=0x555715b043c0, r=0) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/MDSContext.cc:114
#10 0x0000555555806d54 in MDSLogContextBase::complete (this=<optimized out>, r=0) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/MDSContext.cc:123
#11 0x00007fffef02062d in Finisher::finisher_thread_entry() () from target:/usr/lib64/ceph/libceph-common.so.0
#12 0x00007fffecd7d14a in start_thread () from target:/lib64/libpthread.so.0
#13 0x00007fffeb896dc3 in clone () from target:/lib64/libc.so.6

#0  0x0000555555787d37 in CInode::is_ancestor_of (this=0x55580f021800, other=0x5557c1b1c300, other@entry=0x5557c9146700) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/CDentry.h:150
#1  0x00005555557e4147 in SnapRealm::split_at (this=this@entry=0x555557ea0280, child=0x55584e1c1b80) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/SnapRealm.cc:593
#2  0x00005555557a1a5f in CInode::open_snaprealm (this=0x55580f021800, nosplit=<optimized out>) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/CInode.cc:2764
#3  0x00005555557a1e96 in CInode::pop_projected_snaprealm (this=0x55580f021800, next_snaprealm=0x55556425f140, early=<optimized out>) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/CInode.cc:586
#4  0x00005555557a2462 in CInode::early_pop_projected_snaprealm (this=this@entry=0x55580f021800) at /usr/include/c++/8/bits/stl_list.h:1122
#5  0x00005555555d1dc0 in Server::_unlink_local_finish (this=0x555556438500, mdr=..., dn=0x5558541e8f00, straydn=0x555946731860, dnpv=77222) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/Server.cc:7025
#6  0x00005555558067df in Context::complete (r=0, this=0x555943b55f90) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/include/Context.h:77
#7  MDSContext::complete (this=0x555943b55f90, r=0) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/MDSContext.cc:29
#8  0x0000555555806a8f in MDSIOContextBase::complete (this=0x555943b55f90, r=0) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/MDSContext.cc:114
#9  0x0000555555806d54 in MDSLogContextBase::complete (this=<optimized out>, r=0) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/MDSContext.cc:123
#10 0x00007fffef02062d in Finisher::finisher_thread_entry() () from target:/usr/lib64/ceph/libceph-common.so.0
#11 0x00007fffecd7d14a in start_thread () from target:/lib64/libpthread.so.0
#12 0x00007fffeb896dc3 in clone () from target:/lib64/libc.so.6
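
For reference, here is a rough sketch of what I mean by "pointing the debugger at it": locating the fn_anonymous thread and dumping backtraces.  It's only an illustration - the script and its argument handling are made up for this email, and it assumes gdb plus the ceph debuginfo packages are installed (attaching gdb pauses the MDS briefly):

#!/usr/bin/env python3
# find_busy_mds_thread.py - rough sketch, not a polished tool
# Usage: find_busy_mds_thread.py <ceph-mds pid>
import os
import subprocess
import sys

mds_pid = int(sys.argv[1])
task_dir = f"/proc/{mds_pid}/task"

for tid in os.listdir(task_dir):
    with open(f"{task_dir}/{tid}/comm") as f:
        name = f.read().strip()
    if name != "fn_anonymous":
        continue
    # utime + stime (in clock ticks) gives a feel for how busy the thread is
    stat = open(f"{task_dir}/{tid}/stat").read().split()
    print(f"tid {tid}: {name}, cpu ticks = {int(stat[13]) + int(stat[14])}")

# Dump backtraces of all threads; the fn_anonymous one shows up as the
# matching LWP in the output.  This pauses the MDS for a moment.
subprocess.run(["gdb", "-q", "-batch", "-p", str(mds_pid),
                "-ex", "thread apply all bt"])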

The cluster is otherwise healthy - I don't see any performance issues with other Ceph daemons (such as the OSDs).  I scaled snapshots back, so there is only one snapshot in the whole file system while this problem is happening.  From the performance counters, it looks like the MDS is doing about 20 unlink and 3 rmdir operations/s on average when this happens.  From the stack traces, it seems like the unlink operations and the snapshots don't play well together - the unlink path ends up in SnapRealm::split_at, which appears to be where the CPU time goes.

Here are the request rates and latencies from one of these "bad" periods (latencies in seconds):

MDS:
    req/s:  3916.6,  reply/s:  3917.0,  lat:  0.3086
Latencies:
    req_create      :  req/s:  426.4,  lat:  0.5452
    req_getattr     :  req/s:   58.3,  lat:  0.0269
    req_lookup      :  req/s: 3340.6,  lat:  0.2881
    req_mkdir       :  req/s:    0.6,  lat:  0.0417
    req_open        :  req/s:   51.1,  lat:  0.1990
    req_readdir     :  req/s:    3.1,  lat:  0.0338
    req_rename      :  req/s:    2.2,  lat:  0.3481
    req_rmdir       :  req/s:    3.3,  lat:  0.0545
    req_rmxattr     :  req/s:    0.2,  lat:  0.3395
    req_setattr     :  req/s:    0.4,  lat:  0.1154
    req_setfilelock :  req/s:    8.8,  lat:  0.0672
    req_unlink      :  req/s:   21.5,  lat:  0.0113

and for a typical good period (a random sample):

MDS:
    req/s:  3101.7,  reply/s:  3101.7,  lat:  0.0001
Latencies:
    req_create      :  req/s:   12.2,  lat:  0.0008
    req_getattr     :  req/s:   83.4,  lat:  0.0000
    req_lookup      :  req/s: 1102.1,  lat:  0.0000
    req_open        :  req/s:    8.3,  lat:  0.0105
    req_readdir     :  req/s:    1.0,  lat:  0.0010
    req_rename      :  req/s:    1.8,  lat:  0.0001
    req_setfilelock :  req/s: 1890.4,  lat:  0.0000
    req_unlink      :  req/s:    2.6,  lat:  0.0001
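
(These numbers are per-op deltas over a sampling interval.  In case it's useful, below is a rough sketch of how such rates and latencies can be derived from two consecutive 'ceph daemon ... perf dump' samples.  The admin socket path and interval are placeholders, and I'm assuming the mds_server section exposes req_*_latency counters with avgcount/sum fields - adjust to taste.)

#!/usr/bin/env python3
# mds_op_latency.py - rough sketch: diff two MDS perf dumps to get per-op
# request rates and average latencies over the interval.
import json
import subprocess
import time

ASOK = "/var/run/ceph/ceph-mds.a.asok"   # placeholder: admin socket of your MDS
INTERVAL = 10                            # seconds between the two samples

def mds_server_counters():
    out = subprocess.check_output(["ceph", "daemon", ASOK, "perf", "dump"])
    return json.loads(out)["mds_server"]

first = mds_server_counters()
time.sleep(INTERVAL)
second = mds_server_counters()

for key in sorted(second):
    if not key.endswith("_latency"):
        continue
    dcount = second[key]["avgcount"] - first[key]["avgcount"]
    dsum = second[key]["sum"] - first[key]["sum"]
    if dcount > 0:
        # e.g. 'req_unlink_latency' -> 'req_unlink'
        print(f"{key[:-8]:<16} req/s: {dcount / INTERVAL:7.1f}  "
              f"lat: {dsum / dcount:.4f}")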

Has anyone come across this issue or have any hints/suggestions?

Andras

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



