Re: Re: Re: [luminous] OSD memory usage increase when writing a lot of data to cluster


 



Hi Sage,
 
This is the mempool dump of my osd.0:
 
ceph daemon osd.0 dump_mempools
{
    "bloom_filter": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_alloc": {
        "items": 10301352,
        "bytes": 10301352
    },
    "bluestore_cache_data": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_cache_onode": {
        "items": 386,
        "bytes": 145136
    },
    "bluestore_cache_other": {
        "items": 91914,
        "bytes": 779970
    },
    "bluestore_fsck": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_txc": {
        "items": 16,
        "bytes": 7040
    },
    "bluestore_writing_deferred": {
        "items": 11,
        "bytes": 7600020
    },
    "bluestore_writing": {
        "items": 0,
        "bytes": 0
    },
    "bluefs": {
        "items": 170,
        "bytes": 5688
    },
    "buffer_anon": {
        "items": 96726,
        "bytes": 5685575
    },
    "buffer_meta": {
        "items": 30,
        "bytes": 1560
    },
    "osd": {
        "items": 72,
        "bytes": 554688
    },
    "osd_mapbl": {
        "items": 0,
        "bytes": 0
    },
    "osd_pglog": {
        "items": 197946,
        "bytes": 35743344
    },
    "osdmap": {
        "items": 8007,
        "bytes": 144024
    },
    "osdmap_mapping": {
        "items": 0,
        "bytes": 0
    },
    "pgmap": {
        "items": 0,
        "bytes": 0
    },
    "mds_co": {
        "items": 0,
        "bytes": 0
    },
    "unittest_1": {
        "items": 0,
        "bytes": 0
    },
    "unittest_2": {
        "items": 0,
        "bytes": 0
    },
    "total": {
        "items": 10696630,
        "bytes": 60968397
    }
}
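 
In case it helps, a quick way to check what the bluestore cache itself is using is to sum the three bluestore_cache_* pools from the dump above and compare them against the configured limit. This is only a rough sketch; it assumes it runs on the host that owns osd.0 and that the 100MB limit below matches my ceph.conf:

#!/usr/bin/env python
# Rough sketch: sum the bluestore cache mempools reported by
# `ceph daemon osd.N dump_mempools` and compare against the configured limit.
import json
import subprocess

OSD_ID = 0                  # assumption: osd.0 lives on this host
CACHE_LIMIT = 104857600     # bluestore_cache_size from ceph.conf (100MB)

out = subprocess.check_output(
    ["ceph", "daemon", "osd.%d" % OSD_ID, "dump_mempools"])
pools = json.loads(out.decode("utf-8"))

cache_pools = ["bluestore_cache_data", "bluestore_cache_onode", "bluestore_cache_other"]
used = sum(pools[p]["bytes"] for p in cache_pools)
print("bluestore cache mempools: %d bytes (limit %d bytes)" % (used, CACHE_LIMIT))

Against the dump above that gives 0 + 145136 + 779970 = 925106 bytes, so the bluestore cache itself is well under the 100MB limit.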
 
And the memory usage reported by ps:
ceph      8173 27.3 41.0 1509892 848768 ?      Ssl  Oct31 419:30 /usr/bin/ceph-osd --cluster=ceph -i 0 -f --setuser ceph --setgroup ceph
 
And the output of ceph tell osd.0 heap stats:
osd.0 tcmalloc heap stats:------------------------------------------------
MALLOC:      398397808 (  379.9 MiB) Bytes in use by application
MALLOC: +    340647936 (  324.9 MiB) Bytes in page heap freelist
MALLOC: +     32574936 (   31.1 MiB) Bytes in central cache freelist
MALLOC: +     22581232 (   21.5 MiB) Bytes in transfer cache freelist
MALLOC: +     51663048 (   49.3 MiB) Bytes in thread cache freelists
MALLOC: +      3152096 (    3.0 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =    849017056 (  809.7 MiB) Actual memory used (physical + swap)
MALLOC: +    128180224 (  122.2 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =    977197280 (  931.9 MiB) Virtual address space used
MALLOC:
MALLOC:          16765              Spans in use
MALLOC:             32              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------
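 
Looking at these numbers, a large part of the ~800MB RSS seems to be memory tcmalloc is holding in its own freelists (about 325 MiB in the page heap alone) rather than memory accounted for by the mempools. If it is useful as a check, tcmalloc can be asked to hand the freelist back to the OS to see whether the RSS drops:

ceph tell osd.0 heap release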
I have run the write test for about 10 hours; so far no OOM has happened. The OSD uses at most 9xxMB of memory and stays stable at around 800-900MB.
I set the bluestore cache to 100MB with this config:
        bluestore_cache_size = 104857600
        bluestore_cache_size_hdd = 104857600
        bluestore_cache_size_ssd = 104857600
        bluestore_cache_kv_max = 103809024
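For completeness, the values actually in effect on the running daemon can be double-checked over the admin socket, e.g.:

        ceph daemon osd.0 config get bluestore_cache_size
        ceph daemon osd.0 config get bluestore_cache_kv_max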
 
I am not sure how to calculate whether this is right, because if I use bluestore_cache_size - 512m the result would be negative.
Did you mean that rocksdb would use about 512MB of memory?
 
2017-11-01
lin.yunfan

From: Sage Weil <sage@xxxxxxxxxxxx>
Sent: 2017-11-01 20:11
Subject: Re: Re: Re: [ceph-users] [luminous] OSD memory usage increase when writing a lot of data to cluster
To: "shadow_lin" <shadow_lin@xxxxxxx>
Cc: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
 
On Wed, 1 Nov 2017, shadow_lin wrote: 
> Hi Sage, 
> We have tried compiling the latest ceph source code from github. 
> The build is ceph version 12.2.1-249-g42172a4 
> (42172a443183ffe6b36e85770e53fe678db293bf) luminous (stable). 
> The memory problem seems better, but the memory usage of the osd still keeps 
> increasing as more data is written to the rbd image, and the memory usage 
> won't drop after the write is stopped. 
>        Could you specify in which commit the memory bug was fixed? 
 
f60a942023088cbba53a816e6ef846994921cab3 and the prior 2 commits. 
 
If you look at 'ceph daemon osd.nnn dump_mempools' you can see three 
bluestore pools.  This is what bluestore is using to account for its usage 
so it can know when to trim its cache.  Do those add up to the 
bluestore_cache_size - 512m (for rocksdb) that you have configured? 
 
sage 
 
 
> Thanks 
> 2017-11-01 
>  
> ____________________________________________________________________________ 
> lin.yunfan 
>  
> ____________________________________________________________________________ 
>       From: Sage Weil <sage@xxxxxxxxxxxx> 
> Sent: 2017-10-24 20:03 
> Subject: Re: [ceph-users] [luminous] OSD memory usage increase when 
> writing a lot of data to cluster 
> To: "shadow_lin" <shadow_lin@xxxxxxx> 
> Cc: "ceph-users" <ceph-users@xxxxxxxxxxxxxx> 
>   
> On Tue, 24 Oct 2017, shadow_lin wrote:  
> > Hi All,  
> > The cluster has 24 osd with 24 8TB hdd.  
> > Each osd server has 2GB ram and runs 2 OSDs with 2 8TB HDDs. I know the  
> > memory is below the recommended value, but this osd server is an ARM  
> > server so I can't do anything to add more ram.  
> > I created a replicated (2 rep) pool and a 20TB image and mounted it to  
> > the test server with an xfs fs.  
> >    
> > I have set the ceph.conf to this (according to what other related posts  
> > suggested):  
> > [osd]  
> >         bluestore_cache_size = 104857600  
> >         bluestore_cache_size_hdd = 104857600  
> >         bluestore_cache_size_ssd = 104857600  
> >         bluestore_cache_kv_max = 103809024  
> >    
> >         osd map cache size = 20  
> >         osd map max advance = 10  
> >         osd map share max epochs = 10  
> >         osd pg epoch persisted max stale = 10  
> > The bluestore cache settings did improve the situation, but if I try to  
> > write 1TB of data with dd (dd if=/dev/zero of=test bs=1G count=1000) to  
> > the rbd, the osd will eventually be killed by the oom killer.  
> > If I only write about 100G of data at once then everything is fine.  
> >    
> > Why does the osd memory usage keep increasing while writing?  
> > Is there anything I can do to reduce the memory usage?  
>   
> There is a bluestore memory bug that was fixed just after 12.2.1 was   
> released; it will be fixed in 12.2.2.  In the meantime, you can consider  
> running the latest luminous branch (not fully tested) from  
> https://shaman.ceph.com/builds/ceph/luminous.  
>   
> sage  
>  
>  
>  
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
