It's very easy to reproduce this issue in my cluster. With a single MDS and a single client, I prepared 1 million files in a folder and then read the files one by one. Here is the log with log level 20:

2018-11-26 17:46:02.215734 7f40535fe700 7 src/mds/MDCache.cc:6456 mds.0.cache trim_lru trimming 0 items from LRU size=510845 mid=322756 pintail=4892 pinned=49765
......
2018-11-26 17:46:02.215745 7f40535fe700 12 src/mds/MDCache.cc:6644 mds.0.cache trim_dentry [dentry #0x1/tmp/000879266.txt [2,head] auth (dversion lock) pv=0 v=116355 inode=0x7f401b2fc400 state=1073741824 0x7f4012c3e600]
2018-11-26 17:46:02.215758 7f40535fe700 12 src/mds/MDCache.cc:6653 mds.0.cache in container [dir 0x1 / [2,head] auth v=4401399 cv=4173802/4173802 REP dir_auth=0 state=1610612738|complete f(v0 m2018-11-26 11:19:33.739629 6=1+5) n(v437 rc2018-11-26 11:25:19.121540 b392072138355 1833246=1833191+55) hs=6+0,ss=0+0 dirty=1 | child=1 subtree=1 subtreetemp=0 dirty=1 waiter=0 authpin=0 0x7f404bc4d400]
2018-11-26 17:46:02.215775 7f40535fe700 15 src/mds/MDCache.cc:6776 mds.0.cache trim_inode [inode 0x100000d6aa8 [2,head] /tmp/000879266.txt auth v115474 s=65536 n(v0 b65536 1=1+0) (iversion lock) 0x7f401b2fc400]
2018-11-26 17:46:02.215784 7f40535fe700 12 src/mds/CDir.cc:583 mds.0.cache.dir(0x10000000000.101110011*) unlink_inode [dentry #0x1/tmp/000879266.txt [2,head] auth (dversion lock) pv=0 v=116355 inode=0x7f401b2fc400 state=1073741824 0x7f4012c3e600] [inode 0x100000d6aa8 [2,head] /tmp/000879266.txt auth v115474 s=65536 n(v0 b65536 1=1+0) (iversion lock) 0x7f401b2fc400]
2018-11-26 17:46:02.215792 7f40535fe700 14 src/mds/MDCache.cc:291 mds.0.cache remove_inode [inode 0x100000d6aa8 [2,head] #100000d6aa8 auth v115474 s=65536 n(v0 b65536 1=1+0) (iversion lock) 0x7f401b2fc400]
2018-11-26 17:46:02.215807 7f40535fe700 12 src/mds/CDir.cc:456 mds.0.cache.dir(0x10000000000.101110011*) remove_dentry [dentry #0x1/tmp/000879266.txt [2,head] auth NULL (dversion lock) pv=0 v=116355 inode=0 state=1073741824 0x7f4012c3e600]
......
[marvin] trim_dentry runs about 91555 times between 17:46:02.2 and 17:46:05.6, i.e. a few tens of microseconds per call on average; together they consume about 3s.
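[marvin] To make that cost concrete, here is a minimal, simplified sketch of the pattern the log suggests (this is NOT the real MDCache code; Dentry, lru and the mutex are just stand-ins): trim_lru() pops entries off the LRU and calls trim_dentry() for each one, all while the single big lock is held, so everything else waits for the whole batch.

#include <list>
#include <mutex>
#include <string>

struct Dentry { std::string name; };          // stand-in for CDentry

std::mutex mds_lock;                          // stand-in for the MDS big lock
std::list<Dentry*> lru;                       // stand-in for the dentry LRU

// stand-in for MDCache::trim_dentry(): unlink_inode(), remove_inode(),
// remove_dentry() ... each call has a small but real cost.
static void trim_dentry(Dentry* dn) { delete dn; }

void trim_lru(size_t count)
{
    std::lock_guard<std::mutex> guard(mds_lock);  // held for the whole batch
    while (count-- > 0 && !lru.empty()) {
        Dentry* dn = lru.back();
        lru.pop_back();
        trim_dentry(dn);                          // ~91555 iterations ~= 3s in the log above
    }
}

int main()
{
    for (int i = 0; i < 91555; ++i)
        lru.push_back(new Dentry{std::to_string(i) + ".txt"});
    trim_lru(lru.size());                         // client requests queue up behind this
}

During that ~3s window, no client request that needs the lock can make progress, which matches what I see.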
......
[marvin] The durations of the two trim_dentry calls below are the largest, about 30ms:

2018-11-26 17:46:02.369953 7f40535fe700 12 src/mds/MDCache.cc:6644 mds.0.cache trim_dentry [dentry #0x1/tmp/000443283.txt [2,head] auth (dversion lock) pv=0 v=116541 inode=0x7f401db14f00 state=1073741824 0x7f40198a7200]
2018-11-26 17:46:02.369956 7f40535fe700 12 src/mds/MDCache.cc:6653 mds.0.cache in container [dir 0x1 / [2,head] auth v=4401399 cv=4173802/4173802 REP dir_auth=0 state=1610612738|complete f(v0 m2018-11-26 11:19:33.739629 6=1+5) n(v437 rc2018-11-26 11:25:19.121540 b392072138355 1833246=1833191+55) hs=6+0,ss=0+0 dirty=1 | child=1 subtree=1 subtreetemp=0 dirty=1 waiter=0 authpin=0 0x7f404bc4d400]
2018-11-26 17:46:02.369961 7f40535fe700 15 src/mds/MDCache.cc:6776 mds.0.cache trim_inode [inode 0x1000006c397 [2,head] /tmp/000443283.txt auth v101840 s=65536 n(v0 b65536 1=1+0) (iversion lock) 0x7f401db14f00]
2018-11-26 17:46:02.369964 7f40535fe700 12 src/mds/CDir.cc:583 mds.0.cache.dir(0x10000000000.101110101*) unlink_inode [dentry #0x1/tmp/000443283.txt [2,head] auth (dversion lock) pv=0 v=116541 inode=0x7f401db14f00 state=1073741824 0x7f40198a7200] [inode 0x1000006c397 [2,head] /tmp/000443283.txt auth v101840 s=65536 n(v0 b65536 1=1+0) (iversion lock) 0x7f401db14f00]
2018-11-26 17:46:02.369971 7f40535fe700 14 src/mds/MDCache.cc:291 mds.0.cache remove_inode [inode 0x1000006c397 [2,head] #1000006c397 auth v101840 s=65536 n(v0 b65536 1=1+0) (iversion lock) 0x7f401db14f00]
2018-11-26 17:46:02.369974 7f40535fe700 12 src/mds/CDir.cc:456 mds.0.cache.dir(0x10000000000.101110101*) remove_dentry [dentry #0x1/tmp/000443283.txt [2,head] auth NULL (dversion lock) pv=0 v=116541 inode=0 state=1073741824 0x7f40198a7200]
2018-11-26 17:46:02.393052 7f40535fe700 12 src/mds/MDCache.cc:6644 mds.0.cache trim_dentry [dentry #0x1/tmp/000351925.txt [2,head] auth (dversion lock) pv=0 v=116541 inode=0x7f401dba3f00 state=1073741824 0x7f401991d200]
......
2018-11-26 17:46:05.612477 7f40535fe700 7 src/mds/MDCache.cc:6502 mds.0.cache trim_lru trimmed 91555 items
......
2018-11-26 17:46:05.612722 7f40535fe700 2 src/mds/MDCache.cc:7460 mds.0.cache check_memory_usage total 10990068, rss 1629564, heap 10930576, baseline 559504, buffers 0, 20067 / 419292 inodes have caps, 20117 caps, 0.0479785 caps per inode
2018-11-26 17:46:05.612744 7f40535fe700 10 src/mds/MDLog.cc:589 mds.0.log trim 132 / 128 segments, 96963 / -1 events, 0 (0) expiring, 3 (2200) expired
2018-11-26 17:46:05.612749 7f40535fe700 5 src/mds/MDLog.cc:643 mds.0.log trim already expired segment 4864202/9082625714, 1024 events
2018-11-26 17:46:05.612752 7f40535fe700 5 src/mds/MDLog.cc:643 mds.0.log trim already expired segment 4865226/9084582156, 152 events
2018-11-26 17:46:05.612757 7f40535fe700 5 src/mds/MDLog.cc:643 mds.0.log trim already expired segment 4865378/9084873509, 1024 events

On Tue, Nov 27, 2018 at 12:28 AM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> Okay, that sounds like something else is happening then. You might try
> turning up the debug to 20 and seeing what is happening during those 4
> seconds, as that's definitely more time than it should take.
> -Greg
>
>
> On Mon, Nov 26, 2018 at 10:37 AM Marvin Zhang <fanzier@xxxxxxxxx> wrote:
> >
> > I tested it on 12.2.8, and the debug_mds log level is 10/10.
> > I didn't configure mds_cache_memory_limit, so I think it will be 1GB.
> > Here is the log (I set the MDS debug level to 10):
> > ....
> > 2018-11-23 02:01:02.995067 7f40535fe700 7 src/mds/MDCache.cc:6517
> > mds.0.cache trim bytes_used=1GB limit=1GB reservation=0.05% count=0
> > 2018-11-23 02:01:02.995084 7f40535fe700 7 src/mds/MDCache.cc:6456
> > mds.0.cache trim_lru trimming 0 items from LRU size=694884 mid=472796
> > pintail=4942 pinned=19460
> > 2018-11-23 02:01:02.995409 7f404e7fb700 10 src/mds/MDSContext.cc:51
> > MDSIOContextBase::complete: 21C_IO_Dir_OMAP_Fetched
> > 2018-11-23 02:01:04.207192 7f40535fe700 7 src/mds/MDCache.cc:6502
> > mds.0.cache trim_lru trimmed 262781 items
> > 2018-11-23 02:01:04.207211 7f40535fe700 10 src/mds/MDCache.cc:7426
> > mds.0.cache trim_client_leases
> > ....
> >
> > On Mon, Nov 26, 2018 at 10:43 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Nov 26, 2018 at 7:00 AM Marvin Zhang <fanzier@xxxxxxxxx> wrote:
> > > >
> > > > Hi CephFS experts,
> > > > I found that MDCache::trim_lru() sometimes takes more than 2s; is
> > > > that expected? It also seems to hold mds_lock during that time,
> > > > which means other client requests can't be processed.
> > > > Thanks,
> > > > Marvin
> > >
> > > It's definitely not supposed to, but we've started noticing some
> > > performance issues with extremely large MDS cache sizes. How much
> > > memory did you give the cache, and what is your system configuration?
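PS: for reference, the cache limit Greg asks about and the debug level can be checked and raised roughly like this (the daemon name "mds.a" is just a placeholder for whatever the MDS is actually called):

# at runtime, via the admin socket on the MDS host
ceph daemon mds.a config get mds_cache_memory_limit   # defaults to 1073741824 (1GB) in luminous
ceph daemon mds.a config set debug_mds 20/20

# or persistently in ceph.conf on the MDS host
[mds]
debug_mds = 20/20
mds_cache_memory_limit = 1073741824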