On Wed, 5 Sep 2018, Andras Pataki wrote:
> Hi cephers,
>
> Every so often we have a ceph-fuse process that grows to a rather large size (up to eating up the whole memory of the machine). Here is an example of a 200GB RSS ceph-fuse instance:
>
> # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_mempools
> {
>     "bloom_filter": {
>         "items": 0,
>         "bytes": 0
>     },
>     "bluestore_alloc": {
>         "items": 0,
>         "bytes": 0
>     },
>     "bluestore_cache_data": {
>         "items": 0,
>         "bytes": 0
>     },
>     "bluestore_cache_onode": {
>         "items": 0,
>         "bytes": 0
>     },
>     "bluestore_cache_other": {
>         "items": 0,
>         "bytes": 0
>     },
>     "bluestore_fsck": {
>         "items": 0,
>         "bytes": 0
>     },
>     "bluestore_txc": {
>         "items": 0,
>         "bytes": 0
>     },
>     "bluestore_writing_deferred": {
>         "items": 0,
>         "bytes": 0
>     },
>     "bluestore_writing": {
>         "items": 0,
>         "bytes": 0
>     },
>     "bluefs": {
>         "items": 0,
>         "bytes": 0
>     },
>     "buffer_anon": {
>         "items": 51534897,
>         "bytes": 207321872398
>     },
>     "buffer_meta": {
>         "items": 64,
>         "bytes": 5632
>     },
>     "osd": {
>         "items": 0,
>         "bytes": 0
>     },
>     "osd_mapbl": {
>         "items": 0,
>         "bytes": 0
>     },
>     "osd_pglog": {
>         "items": 0,
>         "bytes": 0
>     },
>     "osdmap": {
>         "items": 28593,
>         "bytes": 431872
>     },
>     "osdmap_mapping": {
>         "items": 0,
>         "bytes": 0
>     },
>     "pgmap": {
>         "items": 0,
>         "bytes": 0
>     },
>     "mds_co": {
>         "items": 0,
>         "bytes": 0
>     },
>     "unittest_1": {
>         "items": 0,
>         "bytes": 0
>     },
>     "unittest_2": {
>         "items": 0,
>         "bytes": 0
>     },
>     "total": {
>         "items": 51563554,
>         "bytes": 207322309902
>     }
> }
>
> The general cache size looks like this (if it is helpful I can put a whole cache dump somewhere):
>
> # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_cache | grep path | wc -l
> 84085
> # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_cache | grep name | wc -l
> 168186
>
> Any ideas what 'buffer_anon' is and what could be eating up the 200GB of RAM?

buffer_anon is memory consumed by the bufferlist class that hasn't been explicitly put into a separate mempool category. The question is where and why buffers are getting pinned in memory. Can you dump the perfcounters? That might give some hint. My guess is a leak, or a problem with the ObjectCacher code that is preventing it from trimming older buffers.

How reproducible is the situation? Any idea what workloads trigger it?

Thanks!
sage

>
> We are running with a few ceph-fuse specific parameters increased in ceph.conf:
>
> # Description: Set the number of inodes that the client keeps in the metadata cache.
> # Default: 16384
> client_cache_size = 262144
>
> # Description: Set the maximum number of dirty bytes in the object cache.
> # Default: 104857600 (100MB)
> client_oc_max_dirty = 536870912
>
> # Description: Set the maximum number of objects in the object cache.
> # Default: 1000
> client_oc_max_objects = 8192
>
> # Description: Set how many bytes of data the client will cache.
> # Default: 209715200 (200 MB)
> client_oc_size = 2147483640
>
> # Description: Set the maximum number of bytes that the kernel reads ahead for future read operations. Overridden by the client_readahead_max_periods setting.
> # Default: 0 (unlimited)
> #client_readahead_max_bytes = 67108864
>
> # Description: Set the number of file layout periods (object size * number of stripes) that the kernel reads ahead. Overrides the client_readahead_max_bytes setting.
> # Default: 4
> client_readahead_max_periods = 64
>
> # Description: Set the minimum number of bytes that the kernel reads ahead.
> # Default: 131072 (128KB)
> client_readahead_min = 4194304
>
>
> We are running a 12.2.7 ceph cluster, and the cluster is otherwise healthy.
>
> Any hints would be appreciated. Thanks,
>
> Andras
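
For the perfcounter dump mentioned in the reply above, the same admin socket can be queried. Assuming the socket path from the commands above (and that the relevant counters live in the client/objectcacher sections of the output, which may differ by build), something like:

# ceph daemon /var/run/ceph/ceph-client.admin.asok perf schema
# ceph daemon /var/run/ceph/ceph-client.admin.asok perf dump

Comparing the ObjectCacher byte counters in that output against the configured client_oc_size (2147483640 bytes, roughly 2 GB) should indicate whether the object cache itself accounts for the ~200 GB of buffer_anon memory or whether the buffers are being held somewhere else.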