Re: MDS: obscene buffer_anon memory use when scanning lots of files

Upgraded to 14.2.7; it doesn't appear to have affected the behavior. As requested:

~$ ceph tell mds.mds1 heap stats
2020-02-10 16:52:44.313 7fbda2cae700  0 client.59208005
ms_handle_reset on v2:x.x.x.x:6800/3372494505
2020-02-10 16:52:44.337 7fbda3cb0700  0 client.59249562
ms_handle_reset on v2:x.x.x.x:6800/3372494505
mds.mds1 tcmalloc heap stats:------------------------------------------------
MALLOC:    50000388656 (47684.1 MiB) Bytes in use by application
MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
MALLOC: +    174879528 (  166.8 MiB) Bytes in central cache freelist
MALLOC: +     14511680 (   13.8 MiB) Bytes in transfer cache freelist
MALLOC: +     14089320 (   13.4 MiB) Bytes in thread cache freelists
MALLOC: +     90534048 (   86.3 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =  50294403232 (47964.5 MiB) Actual memory used (physical + swap)
MALLOC: +     50987008 (   48.6 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =  50345390240 (48013.1 MiB) Virtual address space used
MALLOC:
MALLOC:         260018              Spans in use
MALLOC:             20              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.

~$ ceph tell mds.mds1 heap release
2020-02-10 16:52:47.205 7f037eff5700  0 client.59249625
ms_handle_reset on v2:x.x.x.x:6800/3372494505
2020-02-10 16:52:47.237 7f037fff7700  0 client.59249634
ms_handle_reset on v2:x.x.x.x:6800/3372494505
mds.mds1 releasing free RAM back to system.

The buffer_anon pool sampled over 15 minutes or so:

~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 2045,
  "bytes": 3069493686
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 2445,
  "bytes": 3111162538
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 7850,
  "bytes": 7658678767
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 12274,
  "bytes": 11436728978
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 13747,
  "bytes": 11539478519
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 14615,
  "bytes": 13859676992
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 23267,
  "bytes": 22290063830
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 44944,
  "bytes": 40726959425
}

And one more sample about a minute after the heap release, showing continued growth:

~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
  "items": 50694,
  "bytes": 47343942094
}
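
For reference, the samples above were collected by hand; a rough loop like
the following (the 60-second interval is just illustrative) would collect
the same data continuously:

~$ while sleep 60; do date; ceph daemon mds.mds1 dump_mempools | jq -c .mempool.by_pool.buffer_anon; done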

This is on a single active MDS with two standbys. The workload is a scan
of roughly a million files, run with about 20 parallel threads across two
clients, opening and reading each file if it exists.
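
To make the workload concrete, it is roughly equivalent to running
something like this on each client (the path, batch size, and -P value
here are illustrative, not the exact script):

~$ find /mnt/cephfs -type f -print0 | \
       xargs -0 -P 20 -n 64 sh -c 'for f; do [ -e "$f" ] && cat "$f" >/dev/null; done' sh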


On Wed, Jan 22, 2020 at 8:25 AM John Madden <jmadden.com@xxxxxxxxx> wrote:
>
> > Couldn't John confirm that this is the issue by checking the heap stats and triggering the release via
> >
> >   ceph tell mds.mds1 heap stats
> >   ceph tell mds.mds1 heap release
> >
> > (this would be much less disruptive than restarting the MDS)
>
> That was my first thought as well, but `release` doesn't appear to do
> anything in this case.
>
> John
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


