Could you please try making ceph-fuse use the simple messenger (add
"ms type = simple" to the client section of ceph.conf).

Regards
Yan, Zheng
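As an illustration of that change (a minimal sketch; adjust it to fit whatever [client] section already exists in your ceph.conf, and restart/remount the ceph-fuse clients afterwards so they pick up the new messenger type):

    [client]
        ms type = simple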
On Wed, Sep 5, 2018 at 10:09 PM Sage Weil <sage@xxxxxxxxxxxx> wrote:
>
> On Wed, 5 Sep 2018, Andras Pataki wrote:
> > Hi cephers,
> >
> > Every so often we have a ceph-fuse process that grows to a rather large
> > size (at times eating up the whole memory of the machine). Here is an
> > example of a ceph-fuse instance with a 200GB RSS:
> >
> > # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_mempools
> > {
> >     "bloom_filter": { "items": 0, "bytes": 0 },
> >     "bluestore_alloc": { "items": 0, "bytes": 0 },
> >     "bluestore_cache_data": { "items": 0, "bytes": 0 },
> >     "bluestore_cache_onode": { "items": 0, "bytes": 0 },
> >     "bluestore_cache_other": { "items": 0, "bytes": 0 },
> >     "bluestore_fsck": { "items": 0, "bytes": 0 },
> >     "bluestore_txc": { "items": 0, "bytes": 0 },
> >     "bluestore_writing_deferred": { "items": 0, "bytes": 0 },
> >     "bluestore_writing": { "items": 0, "bytes": 0 },
> >     "bluefs": { "items": 0, "bytes": 0 },
> >     "buffer_anon": { "items": 51534897, "bytes": 207321872398 },
> >     "buffer_meta": { "items": 64, "bytes": 5632 },
> >     "osd": { "items": 0, "bytes": 0 },
> >     "osd_mapbl": { "items": 0, "bytes": 0 },
> >     "osd_pglog": { "items": 0, "bytes": 0 },
> >     "osdmap": { "items": 28593, "bytes": 431872 },
> >     "osdmap_mapping": { "items": 0, "bytes": 0 },
> >     "pgmap": { "items": 0, "bytes": 0 },
> >     "mds_co": { "items": 0, "bytes": 0 },
> >     "unittest_1": { "items": 0, "bytes": 0 },
> >     "unittest_2": { "items": 0, "bytes": 0 },
> >     "total": { "items": 51563554, "bytes": 207322309902 }
> > }
> >
> > The general cache size looks like this (if it is helpful I can put a
> > whole cache dump somewhere):
> >
> > # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_cache | grep path | wc -l
> > 84085
> > # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_cache | grep name | wc -l
> > 168186
> >
> > Any ideas what 'buffer_anon' is and what could be eating up the 200GB
> > of RAM?
>
> buffer_anon is memory consumed by the bufferlist class that hasn't been
> explicitly put into a separate mempool category. The question is
> where/why buffers are getting pinned in memory. Can you dump the
> perfcounters? That might give some hint.
>
> My guess is a leak, or a problem with the ObjectCacher code that is
> preventing it from trimming older buffers.
>
> How reproducible is the situation? Any idea what workloads trigger it?
>
> Thanks!
> sage
>
> > We are running with a few ceph-fuse specific parameters increased in
> > ceph.conf:
> >
> > # Description: Set the number of inodes that the client keeps in
> > #              the metadata cache.
> > # Default: 16384
> > client_cache_size = 262144
> >
> > # Description: Set the maximum number of dirty bytes in the object
> > #              cache.
> > # Default: 104857600 (100MB)
> > client_oc_max_dirty = 536870912
> >
> > # Description: Set the maximum number of objects in the object cache.
> > # Default: 1000
> > client_oc_max_objects = 8192
> >
> > # Description: Set how many bytes of data the client will cache.
> > # Default: 209715200 (200 MB)
> > client_oc_size = 2147483640
> >
> > # Description: Set the maximum number of bytes that the kernel reads
> > #              ahead for future read operations. Overridden by the
> > #              client_readahead_max_periods setting.
> > # Default: 0 (unlimited)
> > #client_readahead_max_bytes = 67108864
> >
> > # Description: Set the number of file layout periods (object size *
> > #              number of stripes) that the kernel reads ahead. Overrides
> > #              the client_readahead_max_bytes setting.
> > # Default: 4
> > client_readahead_max_periods = 64
> >
> > # Description: Set the minimum number of bytes that the kernel reads
> > #              ahead.
> > # Default: 131072 (128KB)
> > client_readahead_min = 4194304
> >
> > We are running a 12.2.7 ceph cluster, and the cluster is otherwise
> > healthy.
> >
> > Any hints would be appreciated. Thanks,
> >
> > Andras
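For reference, the perfcounters Sage asks about can be dumped from the same client admin socket used for the mempool and cache dumps above, e.g.:

    # ceph daemon /var/run/ceph/ceph-client.admin.asok perf dump

(This is the standard admin-socket "perf dump" command; the socket path is simply the one already used earlier in the thread.)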