Thanks for the log. I think it's caused by
http://tracker.ceph.com/issues/36192

Regards
Yan, Zheng

On Wed, Sep 26, 2018 at 1:51 AM Andras Pataki
<apataki@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi Zheng,
>
> Here is a debug dump:
> https://users.flatironinstitute.org/apataki/public_www/7f0011f676112cd4/
> I have also included some other corresponding information (cache dump,
> mempool dump, perf dump and ceph.conf). This corresponds to a 100GB
> ceph-fuse process while the client code is running. I can reproduce
> this issue at will in about 6 to 8 hours of running one of our
> scientific jobs - and I can also run a more instrumented/patched/etc.
> code to try.
>
> Andras
>
>
> On 9/24/18 10:06 PM, Yan, Zheng wrote:
> > On Tue, Sep 25, 2018 at 2:23 AM Andras Pataki
> > <apataki@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >> The whole cluster, including ceph-fuse, is version 12.2.7.
> >>
> > If this issue happens again, please set the "debug_objectcacher" option of
> > ceph-fuse to 15 (for 30 seconds) and send the ceph-fuse log to us.
> >
> > Regards
> > Yan, Zheng
> >
> >
> >> Andras
> >>
> >> On 9/24/18 6:27 AM, Yan, Zheng wrote:
> >>> On Fri, Sep 21, 2018 at 5:40 AM Andras Pataki
> >>> <apataki@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >>>> I've done some more experiments playing with client config parameters,
> >>>> and it seems like the client_oc_size parameter is very correlated to
> >>>> how big ceph-fuse grows. With its default value of 200MB, ceph-fuse
> >>>> gets to about 22GB of RSS; with our previous client_oc_size value of
> >>>> 2GB, the ceph-fuse process grows to 211GB. After this size is reached,
> >>>> its memory usage levels out. So it seems like there is an issue
> >>>> accounting for memory in the client cache - whatever client_oc_size is
> >>>> set to, about 100 times more memory gets used, in our case at least.
> >>>>
> >>> ceph-fuse version ?
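A quick back-of-the-envelope check of the ~100x figure reported above (an illustrative sketch only; the RSS and client_oc_size values are the ones quoted in the thread):

```python
# Amplification = observed ceph-fuse RSS divided by configured client_oc_size.
# Both observations land near 100x, which is what points at a cache
# accounting problem rather than a workload-dependent leak.

GiB = 1024 ** 3
MiB = 1024 ** 2

cases = {
    "default client_oc_size (200 MB)": (22 * GiB, 200 * MiB),
    "raised client_oc_size (2 GB)": (211 * GiB, 2 * GiB),
}

for label, (rss, oc_size) in cases.items():
    print(f"{label}: amplification ~{rss / oc_size:.0f}x")
```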
> >>>
> >>>> Andras
> >>>>
> >>>> On 9/19/18 6:06 PM, Andras Pataki wrote:
> >>>>> Hi Zheng,
> >>>>>
> >>>>> It looks like the memory growth happens even with the simple messenger:
> >>>>>
> >>>>> [root@worker1032 ~]# ceph daemon /var/run/ceph/ceph-client.admin.asok config get ms_type
> >>>>> {
> >>>>>     "ms_type": "simple"
> >>>>> }
> >>>>> [root@worker1032 ~]# ps -auxwww | grep ceph-fuse
> >>>>> root 179133 82.2 13.5 77281896 71644120 ? Sl 12:48 258:09
> >>>>> ceph-fuse --id=admin --conf=/etc/ceph/ceph.conf /mnt/ceph -o rw,fsname=ceph,dev,suid
> >>>>> [root@worker1032 ~]# ceph daemon /var/run/ceph/ceph-client.admin.asok dump_mempools
> >>>>> {
> >>>>>     ... snip ...
> >>>>>     "buffer_anon": {
> >>>>>         "items": 16753337,
> >>>>>         "bytes": 68782648777
> >>>>>     },
> >>>>>     "buffer_meta": {
> >>>>>         "items": 771,
> >>>>>         "bytes": 67848
> >>>>>     },
> >>>>>     ... snip ...
> >>>>>     "osdmap": {
> >>>>>         "items": 28582,
> >>>>>         "bytes": 431840
> >>>>>     },
> >>>>>     ... snip ...
> >>>>>     "total": {
> >>>>>         "items": 16782690,
> >>>>>         "bytes": 68783148465
> >>>>>     }
> >>>>> }
> >>>>>
> >>>>> Andras
> >>>>>
> >>>>>
> >>>>> On 9/6/18 11:58 PM, Yan, Zheng wrote:
> >>>>>> Could you please try making ceph-fuse use the simple messenger (add
> >>>>>> "ms type = simple" to the client section of ceph.conf)?
> >>>>>>
> >>>>>> Regards
> >>>>>> Yan, Zheng
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Sep 5, 2018 at 10:09 PM Sage Weil <sage@xxxxxxxxxxxx> wrote:
> >>>>>>> On Wed, 5 Sep 2018, Andras Pataki wrote:
> >>>>>>>> Hi cephers,
> >>>>>>>>
> >>>>>>>> Every so often we have a ceph-fuse process that grows to a rather
> >>>>>>>> large size (up to eating up the whole memory of the machine).
> >>>>>>>> Here is an example of a 200GB RSS size ceph-fuse instance:
> >>>>>>>>
> >>>>>>>> # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_mempools
> >>>>>>>> {
> >>>>>>>>     "bloom_filter": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "bluestore_alloc": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "bluestore_cache_data": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "bluestore_cache_onode": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "bluestore_cache_other": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "bluestore_fsck": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "bluestore_txc": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "bluestore_writing_deferred": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "bluestore_writing": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "bluefs": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "buffer_anon": {
> >>>>>>>>         "items": 51534897,
> >>>>>>>>         "bytes": 207321872398
> >>>>>>>>     },
> >>>>>>>>     "buffer_meta": {
> >>>>>>>>         "items": 64,
> >>>>>>>>         "bytes": 5632
> >>>>>>>>     },
> >>>>>>>>     "osd": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "osd_mapbl": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "osd_pglog": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "osdmap": {
> >>>>>>>>         "items": 28593,
> >>>>>>>>         "bytes": 431872
> >>>>>>>>     },
> >>>>>>>>     "osdmap_mapping": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "pgmap": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "mds_co": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "unittest_1": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "unittest_2": {
> >>>>>>>>         "items": 0,
> >>>>>>>>         "bytes": 0
> >>>>>>>>     },
> >>>>>>>>     "total": {
> >>>>>>>>         "items": 51563554,
> >>>>>>>>         "bytes": 207322309902
> >>>>>>>>     }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> The general cache size looks like this (if it is helpful I can put
> >>>>>>>> a whole cache dump somewhere):
> >>>>>>>>
> >>>>>>>> # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_cache | grep path | wc -l
> >>>>>>>> 84085
> >>>>>>>> # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_cache | grep name | wc -l
> >>>>>>>> 168186
> >>>>>>>>
> >>>>>>>> Any ideas what 'buffer_anon' is and what could be eating up the
> >>>>>>>> 200GB of RAM?
> >>>>>>> buffer_anon is memory consumed by the bufferlist class that hasn't been
> >>>>>>> explicitly put into a separate mempool category. The question is
> >>>>>>> where/why buffers are getting pinned in memory. Can you dump the
> >>>>>>> perfcounters? That might give some hint.
> >>>>>>>
> >>>>>>> My guess is a leak, or a problem with the ObjectCacher code that is
> >>>>>>> preventing it from trimming older buffers.
> >>>>>>>
> >>>>>>> How reproducible is the situation? Any idea what workloads trigger it?
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>> sage
> >>>>>>>
> >>>>>>>> We are running with a few ceph-fuse specific parameters increased in
> >>>>>>>> ceph.conf:
> >>>>>>>>
> >>>>>>>>     # Description: Set the number of inodes that the client keeps
> >>>>>>>>     # in the metadata cache.
> >>>>>>>>     # Default: 16384
> >>>>>>>>     client_cache_size = 262144
> >>>>>>>>
> >>>>>>>>     # Description: Set the maximum number of dirty bytes in the
> >>>>>>>>     # object cache.
> >>>>>>>>     # Default: 104857600 (100MB)
> >>>>>>>>     client_oc_max_dirty = 536870912
> >>>>>>>>
> >>>>>>>>     # Description: Set the maximum number of objects in the object
> >>>>>>>>     # cache.
> >>>>>>>>     # Default: 1000
> >>>>>>>>     client_oc_max_objects = 8192
> >>>>>>>>
> >>>>>>>>     # Description: Set how many bytes of data the client will cache.
> >>>>>>>>     # Default: 209715200 (200 MB)
> >>>>>>>>     client_oc_size = 2147483640
> >>>>>>>>
> >>>>>>>>     # Description: Set the maximum number of bytes that the kernel
> >>>>>>>>     # reads ahead for future read operations. Overridden by the
> >>>>>>>>     # client_readahead_max_periods setting.
> >>>>>>>>     # Default: 0 (unlimited)
> >>>>>>>>     #client_readahead_max_bytes = 67108864
> >>>>>>>>
> >>>>>>>>     # Description: Set the number of file layout periods (object
> >>>>>>>>     # size * number of stripes) that the kernel reads ahead.
> >>>>>>>>     # Overrides the client_readahead_max_bytes setting.
> >>>>>>>>     # Default: 4
> >>>>>>>>     client_readahead_max_periods = 64
> >>>>>>>>
> >>>>>>>>     # Description: Set the minimum number of bytes that the kernel
> >>>>>>>>     # reads ahead.
> >>>>>>>>     # Default: 131072 (128KB)
> >>>>>>>>     client_readahead_min = 4194304
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> We are running a 12.2.7 ceph cluster, and the cluster is otherwise
> >>>>>>>> healthy.
> >>>>>>>>
> >>>>>>>> Any hints would be appreciated. Thanks,
> >>>>>>>>
> >>>>>>>> Andras
> >>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>> ceph-users mailing list
> >>>>>>> ceph-users@xxxxxxxxxxxxxx
> >>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
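For anyone triaging a similar symptom, a small script along these lines can rank the mempools and compute the average allocation size in the dominant pool. This is a sketch, not Ceph tooling: the JSON below is embedded from the 200GB instance in this thread (abridged to the non-zero pools); in practice you would feed it the output of `ceph daemon <asok> dump_mempools` instead.

```python
import json

# Abridged dump_mempools output from the 200GB ceph-fuse instance above.
dump = json.loads("""
{
    "buffer_anon": {"items": 51534897, "bytes": 207321872398},
    "buffer_meta": {"items": 64, "bytes": 5632},
    "osdmap": {"items": 28593, "bytes": 431872},
    "total": {"items": 51563554, "bytes": 207322309902}
}
""")

total = dump.pop("total")

# Rank pools by their share of total bytes; buffer_anon dominates here.
for name, pool in sorted(dump.items(), key=lambda kv: -kv[1]["bytes"]):
    share = 100.0 * pool["bytes"] / total["bytes"]
    print(f"{name:12s} {pool['bytes']:>15d} bytes ({share:5.1f}% of total)")

# Average allocation size in the dominant pool - roughly 4 KB per item
# here, i.e. page-sized buffers rather than many tiny allocations.
anon = dump["buffer_anon"]
print(f"buffer_anon: {anon['bytes'] / anon['items']:.0f} bytes/item")
```

The ~4 KB/item figure is the kind of hint Sage asks for above: it suggests whole cached data buffers being pinned (consistent with an ObjectCacher trimming problem) rather than a small-object leak.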