On Tue, Sep 25, 2018 at 2:23 AM Andras Pataki <apataki@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> The whole cluster, including ceph-fuse, is version 12.2.7.

If this issue happens again, please set the "debug_objectcacher" option of
ceph-fuse to 15 (for 30 seconds) and send the ceph-fuse log to us.

Regards
Yan, Zheng

> Andras
>
> On 9/24/18 6:27 AM, Yan, Zheng wrote:
> > On Fri, Sep 21, 2018 at 5:40 AM Andras Pataki
> > <apataki@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >> I've done some more experiments playing with client config parameters,
> >> and it seems like the client_oc_size parameter is strongly correlated with
> >> how big ceph-fuse grows.  With its default value of 200MB, ceph-fuse
> >> gets to about 22GB of RSS; with our previous client_oc_size value of
> >> 2GB, the ceph-fuse process grows to 211GB.  After this size is reached,
> >> its memory usage levels out.  So it seems like there is an issue
> >> accounting for memory in the client cache - whatever client_oc_size is
> >> set to, about 100 times more memory gets used, at least in our case.
> >>
> > ceph-fuse version ?
> >
> >> Andras
> >>
> >> On 9/19/18 6:06 PM, Andras Pataki wrote:
> >>> Hi Zheng,
> >>>
> >>> It looks like the memory growth happens even with the simple messenger:
> >>>
> >>> [root@worker1032 ~]# ceph daemon /var/run/ceph/ceph-client.admin.asok config get ms_type
> >>> {
> >>>     "ms_type": "simple"
> >>> }
> >>> [root@worker1032 ~]# ps -auxwww | grep ceph-fuse
> >>> root     179133 82.2 13.5 77281896 71644120 ?  Sl  12:48 258:09 ceph-fuse --id=admin --conf=/etc/ceph/ceph.conf /mnt/ceph -o rw,fsname=ceph,dev,suid
> >>> [root@worker1032 ~]# ceph daemon /var/run/ceph/ceph-client.admin.asok dump_mempools
> >>> {
> >>> ... snip ...
> >>>     "buffer_anon": {
> >>>         "items": 16753337,
> >>>         "bytes": 68782648777
> >>>     },
> >>>     "buffer_meta": {
> >>>         "items": 771,
> >>>         "bytes": 67848
> >>>     },
> >>> ... snip ...
> >>>     "osdmap": {
> >>>         "items": 28582,
> >>>         "bytes": 431840
> >>>     },
> >>> ... snip ...
> >>>
> >>>     "total": {
> >>>         "items": 16782690,
> >>>         "bytes": 68783148465
> >>>     }
> >>> }
> >>>
> >>> Andras
> >>>
> >>>
> >>> On 9/6/18 11:58 PM, Yan, Zheng wrote:
> >>>> Could you please try making ceph-fuse use the simple messenger (add "ms type
> >>>> = simple" to the client section of ceph.conf).
> >>>>
> >>>> Regards
> >>>> Yan, Zheng
> >>>>
> >>>>
> >>>>
> >>>> On Wed, Sep 5, 2018 at 10:09 PM Sage Weil <sage@xxxxxxxxxxxx> wrote:
> >>>>> On Wed, 5 Sep 2018, Andras Pataki wrote:
> >>>>>> Hi cephers,
> >>>>>>
> >>>>>> Every so often we have a ceph-fuse process that grows to a rather large
> >>>>>> size (up to eating up the whole memory of the machine).  Here is an
> >>>>>> example of a 200GB RSS ceph-fuse instance:
> >>>>>>
> >>>>>> # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_mempools
> >>>>>> {
> >>>>>>     "bloom_filter": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "bluestore_alloc": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "bluestore_cache_data": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "bluestore_cache_onode": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "bluestore_cache_other": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "bluestore_fsck": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "bluestore_txc": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "bluestore_writing_deferred": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "bluestore_writing": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "bluefs": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "buffer_anon": {
> >>>>>>         "items": 51534897,
> >>>>>>         "bytes": 207321872398
> >>>>>>     },
> >>>>>>     "buffer_meta": {
> >>>>>>         "items": 64,
> >>>>>>         "bytes": 5632
> >>>>>>     },
> >>>>>>     "osd": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "osd_mapbl": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "osd_pglog": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "osdmap": {
> >>>>>>         "items": 28593,
> >>>>>>         "bytes": 431872
> >>>>>>     },
> >>>>>>     "osdmap_mapping": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "pgmap": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "mds_co": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "unittest_1": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "unittest_2": {
> >>>>>>         "items": 0,
> >>>>>>         "bytes": 0
> >>>>>>     },
> >>>>>>     "total": {
> >>>>>>         "items": 51563554,
> >>>>>>         "bytes": 207322309902
> >>>>>>     }
> >>>>>> }
> >>>>>>
> >>>>>> The general cache size looks like this (if it is helpful I can put
> >>>>>> a whole cache dump somewhere):
> >>>>>>
> >>>>>> # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_cache | grep path | wc -l
> >>>>>> 84085
> >>>>>> # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_cache | grep name | wc -l
> >>>>>> 168186
> >>>>>>
> >>>>>> Any ideas what 'buffer_anon' is and what could be eating up the
> >>>>>> 200GB of RAM?
> >>>>> buffer_anon is memory consumed by the bufferlist class that hasn't been
> >>>>> explicitly put into a separate mempool category.  The question is
> >>>>> where/why buffers are getting pinned in memory.  Can you dump the
> >>>>> perfcounters?  That might give some hint.
> >>>>>
> >>>>> My guess is a leak, or a problem with the ObjectCacher code that is
> >>>>> preventing it from trimming older buffers.
> >>>>>
> >>>>> How reproducible is the situation?  Any idea what workloads trigger it?
> >>>>>
> >>>>> Thanks!
> >>>>> sage
> >>>>>
> >>>>>> We are running with a few ceph-fuse specific parameters increased in
> >>>>>> ceph.conf:
> >>>>>>
> >>>>>>     # Description: Set the number of inodes that the client keeps in the metadata cache.
> >>>>>>     # Default: 16384
> >>>>>>     client_cache_size = 262144
> >>>>>>
> >>>>>>     # Description: Set the maximum number of dirty bytes in the object cache.
> >>>>>>     # Default: 104857600 (100MB)
> >>>>>>     client_oc_max_dirty = 536870912
> >>>>>>
> >>>>>>     # Description: Set the maximum number of objects in the object cache.
> >>>>>>     # Default: 1000
> >>>>>>     client_oc_max_objects = 8192
> >>>>>>
> >>>>>>     # Description: Set how many bytes of data the client will cache.
> >>>>>>     # Default: 209715200 (200 MB)
> >>>>>>     client_oc_size = 2147483640
> >>>>>>
> >>>>>>     # Description: Set the maximum number of bytes that the kernel reads
> >>>>>>     ahead for future read operations. Overridden by the
> >>>>>>     client_readahead_max_periods setting.
> >>>>>>     # Default: 0 (unlimited)
> >>>>>>     #client_readahead_max_bytes = 67108864
> >>>>>>
> >>>>>>     # Description: Set the number of file layout periods (object size *
> >>>>>>     number of stripes) that the kernel reads ahead. Overrides the
> >>>>>>     client_readahead_max_bytes setting.
> >>>>>>     # Default: 4
> >>>>>>     client_readahead_max_periods = 64
> >>>>>>
> >>>>>>     # Description: Set the minimum number of bytes that the kernel reads ahead.
> >>>>>>     # Default: 131072 (128KB)
> >>>>>>     client_readahead_min = 4194304
> >>>>>>
> >>>>>>
> >>>>>> We are running a 12.2.7 ceph cluster, and the cluster is otherwise healthy.
> >>>>>>
> >>>>>> Any hints would be appreciated.  Thanks,
> >>>>>>
> >>>>>> Andras
> >>>>>>
> >>>>> _______________________________________________
> >>>>> ceph-users mailing list
> >>>>> ceph-users@xxxxxxxxxxxxxx
> >>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
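[Editor's note] The "about 100 times client_oc_size" observation in the thread can be sanity-checked with a few lines of arithmetic against the numbers quoted above. This is only a sketch: the helper name is ours, and the byte counts are copied from the dumps and RSS figures reported in the thread.

```python
# Sanity-check the ~100x overshoot reported in this thread. The figures
# below are taken verbatim from the emails above; the function is a
# hypothetical helper, not part of any Ceph tooling.

def oc_overshoot_ratio(observed_bytes, client_oc_size):
    """How many times larger the observed cache memory is than its limit."""
    return observed_bytes / client_oc_size

# 200GB-RSS instance (buffer_anon bytes from dump_mempools), oc_size = 2 GiB:
ratio_2g = oc_overshoot_ratio(207_321_872_398, 2_147_483_640)

# Default 200MB cache case, where RSS leveled out around 22 GB:
ratio_200m = oc_overshoot_ratio(22 * 10**9, 209_715_200)

# Both come out on the order of 100x, matching the observation above.
print(round(ratio_2g), round(ratio_200m))
```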
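[Editor's note] For anyone digesting dump_mempools output while following Sage's suggestion, here is a small sketch that ranks pools by their share of total bytes. The function name is ours, not Ceph tooling; the dict mirrors the flat structure of the dumps quoted above, trimmed to the non-zero pools.

```python
# Rank mempools by share of total bytes, given a parsed dump_mempools-style
# dict. Hypothetical helper; the sample figures are from the 200GB-RSS dump
# quoted in this thread.

def mempool_shares(dump):
    """Return (pool_name, fraction_of_total_bytes) pairs, largest first."""
    total = dump["total"]["bytes"]
    pools = [(name, d["bytes"] / total)
             for name, d in dump.items()
             if name != "total" and d["bytes"] > 0]
    return sorted(pools, key=lambda p: p[1], reverse=True)

dump = {
    "buffer_anon": {"items": 51534897, "bytes": 207321872398},
    "buffer_meta": {"items": 64, "bytes": 5632},
    "osdmap": {"items": 28593, "bytes": 431872},
    "total": {"items": 51563554, "bytes": 207322309902},
}

for name, share in mempool_shares(dump):
    print(f"{name}: {share:.4%}")
```

Run against the dump above, buffer_anon accounts for essentially all of the bytes, which is why the discussion centers on the ObjectCacher's bufferlists.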
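[Editor's note] The readahead settings in the ceph.conf excerpt multiply out as follows, assuming the default CephFS file layout of 4 MiB objects with stripe count 1 (our assumption; actual file layouts vary per file/directory).

```python
# Rough readahead arithmetic for client_readahead_max_periods = 64.
# One "period" is object size * stripe count; the layout values below
# are assumed defaults, not taken from the thread.

MiB = 1024 * 1024
object_size = 4 * MiB    # assumed default CephFS object size
stripe_count = 1         # assumed default stripe count

period = object_size * stripe_count   # one file layout period
readahead_cap = 64 * period           # client_readahead_max_periods = 64

print(readahead_cap // MiB)           # cap in MiB per file at these defaults
```

Under those assumptions the cap works out to 256 MiB of readahead per file, well above the commented-out client_readahead_max_bytes of 64 MiB.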