Upgraded to 14.2.7; it doesn't appear to have affected the behavior. As
requested:

~$ ceph tell mds.mds1 heap stats
2020-02-10 16:52:44.313 7fbda2cae700  0 client.59208005 ms_handle_reset on v2:x.x.x.x:6800/3372494505
2020-02-10 16:52:44.337 7fbda3cb0700  0 client.59249562 ms_handle_reset on v2:x.x.x.x:6800/3372494505
mds.mds1 tcmalloc heap stats:------------------------------------------------
MALLOC:    50000388656 (47684.1 MiB) Bytes in use by application
MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
MALLOC: +    174879528 (  166.8 MiB) Bytes in central cache freelist
MALLOC: +     14511680 (   13.8 MiB) Bytes in transfer cache freelist
MALLOC: +     14089320 (   13.4 MiB) Bytes in thread cache freelists
MALLOC: +     90534048 (   86.3 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =  50294403232 (47964.5 MiB) Actual memory used (physical + swap)
MALLOC: +     50987008 (   48.6 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =  50345390240 (48013.1 MiB) Virtual address space used
MALLOC:
MALLOC:         260018              Spans in use
MALLOC:             20              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.

~$ ceph tell mds.mds1 heap release
2020-02-10 16:52:47.205 7f037eff5700  0 client.59249625 ms_handle_reset on v2:x.x.x.x:6800/3372494505
2020-02-10 16:52:47.237 7f037fff7700  0 client.59249634 ms_handle_reset on v2:x.x.x.x:6800/3372494505
mds.mds1 releasing free RAM back to system.

The buffer_anon pool sampled over 15 minutes or so:

~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 2045,
  "bytes": 3069493686
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 2445,
  "bytes": 3111162538
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 7850,
  "bytes": 7658678767
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 12274,
  "bytes": 11436728978
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 13747,
  "bytes": 11539478519
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 14615,
  "bytes": 13859676992
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 23267,
  "bytes": 22290063830
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 44944,
  "bytes": 40726959425
}

And one sample about a minute after the heap release, showing continued growth:

~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 50694,
  "bytes": 47343942094
}

This is on a single active MDS with two standbys. The workload is a scan
of about a million files using about 20 parallel threads across two
clients, opening and reading each file if it exists (rough sketches of
the sampling loop and the workload follow below the quoted thread).

On Wed, Jan 22, 2020 at 8:25 AM John Madden <jmadden.com@xxxxxxxxx> wrote:
>
> > Couldn't John confirm that this is the issue by checking the heap
> > stats and triggering the release via
> >
> > ceph tell mds.mds1 heap stats
> > ceph tell mds.mds1 heap release
> >
> > (this would be much less disruptive than restarting the MDS)
>
> That was my first thought as well, but `release` doesn't appear to do
> anything in this case.
>
> John
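
For anyone who wants to reproduce the sampling, a loop along these lines
works (a rough sketch, not necessarily how the samples above were taken;
the 120-second interval is arbitrary):

# Run on the MDS host, since "ceph daemon" talks to the local admin socket.
while true; do
    date +%T
    ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
    sleep 120
done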
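
And the client-side workload amounts to roughly this (a simplified
stand-in, not the actual test script; /tmp/file_list.txt is a placeholder
for the real list of candidate paths on the CephFS mount):

# ~20 parallel readers over a newline-separated list of paths (no spaces
# assumed in the paths); each file is opened and read only if it exists.
xargs -P 20 -n 1 sh -c 'test -f "$0" && cat "$0" > /dev/null' < /tmp/file_list.txt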