Good point, thanks! By creating memory pressure (playing with
vm.min_free_kbytes), the memory does get freed by the kernel. So I think I
essentially need to update my monitoring rules to avoid false positives.

Thanks, I'll keep reading through the resources you linked.
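For the archives, "playing with vm.min_free_kbytes" amounts to something
like the sketch below; the value shown is purely illustrative and should be
sized to the host's RAM:

    # show the current reserve, in KiB
    sysctl vm.min_free_kbytes

    # raise it temporarily to create reclaim pressure (value is illustrative)
    sysctl -w vm.min_free_kbytes=1048576

    # remember to write the original value back afterwards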
On Tuesday 9 April 2019 at 09:30 -0500, Mark Nelson wrote:
> My understanding is that basically the kernel is either unable or
> uninterested (maybe due to lack of memory pressure?) in reclaiming the
> memory. It's possible you might have better behavior if you set
> /sys/kernel/mm/khugepaged/max_ptes_none to a low value (maybe 0) or
> maybe disable transparent huge pages entirely.
>
> Some background:
>
> https://github.com/gperftools/gperftools/issues/1073
> https://blog.nelhage.com/post/transparent-hugepages/
> https://www.kernel.org/doc/Documentation/vm/transhuge.txt
>
> Mark
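Concretely, that suggestion corresponds to knobs along these lines (a
sketch; on recent kernels the khugepaged tunables live under
/sys/kernel/mm/transparent_hugepage/khugepaged/, so check which path your
kernel actually exposes):

    # stop khugepaged from collapsing regions that still contain
    # unallocated pages
    echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none

    # or disable transparent huge pages entirely
    echo never > /sys/kernel/mm/transparent_hugepage/enabled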
> On 4/9/19 7:31 AM, Olivier Bonvalet wrote:
> > Well, Dan seems to be right:
> >
> > _tune_cache_size
> >    target: 4294967296
> >    heap: 6514409472
> >    unmapped: 2267537408
> >    mapped: 4246872064
> >    old cache_size: 2845396873
> >    new cache size: 2845397085
> >
> > So we have 6GB in the heap, but "only" 4GB mapped.
> >
> > But "ceph tell osd.* heap release" should have released that?
> >
> > Thanks,
> >
> > Olivier
> >
> > On Monday 8 April 2019 at 16:09 -0500, Mark Nelson wrote:
> > > One of the difficulties with the osd_memory_target work is that we
> > > can't tune based on the RSS memory usage of the process. Ultimately
> > > it's up to the kernel to decide to reclaim memory, and especially
> > > with transparent huge pages it's tough to judge what the kernel is
> > > going to do even if memory has been unmapped by the process.
> > > Instead, the autotuner looks at how much memory has been mapped and
> > > tries to balance the caches based on that.
> > >
> > > In addition to Dan's advice, you might also want to enable debug
> > > bluestore at level 5 and look for lines containing "target:" and
> > > "cache_size:". These will tell you the current target, the mapped
> > > memory, unmapped memory, heap size, previous aggregate cache size,
> > > and new aggregate cache size. The other line will give you a
> > > breakdown of how much memory was assigned to each of the bluestore
> > > caches and how much each cache is using. If there is a memory leak,
> > > the autotuner can only do so much. At some point it will reduce the
> > > caches to fit within cache_min and leave it there.
> > >
> > > Mark
> > >
> > > On 4/8/19 5:18 AM, Dan van der Ster wrote:
> > > > Which OS are you using?
> > > > With CentOS we find that the heap is not always automatically
> > > > released. (You can check the heap freelist with `ceph tell osd.0
> > > > heap stats`.)
> > > > As a workaround we run this hourly:
> > > >
> > > > ceph tell mon.* heap release
> > > > ceph tell osd.* heap release
> > > > ceph tell mds.* heap release
> > > >
> > > > -- Dan
> > > >
> > > > On Sat, Apr 6, 2019 at 1:30 PM Olivier Bonvalet <ceph.list@xxxxxxxxx> wrote:
> > > > > Hi,
> > > > >
> > > > > on a Luminous 12.2.11 deployment, my BlueStore OSDs exceed the
> > > > > osd_memory_target:
> > > > >
> > > > > daevel-ob@ssdr712h:~$ ps auxw | grep ceph-osd
> > > > > ceph  3646 17.1 12.0 6828916 5893136 ? Ssl mars29 1903:42 /usr/bin/ceph-osd -f --cluster ceph --id 143 --setuser ceph --setgroup ceph
> > > > > ceph  3991 12.9 11.2 6342812 5485356 ? Ssl mars29 1443:41 /usr/bin/ceph-osd -f --cluster ceph --id 144 --setuser ceph --setgroup ceph
> > > > > ceph  4361 16.9 11.8 6718432 5783584 ? Ssl mars29 1889:41 /usr/bin/ceph-osd -f --cluster ceph --id 145 --setuser ceph --setgroup ceph
> > > > > ceph  4731 19.7 12.2 6949584 5982040 ? Ssl mars29 2198:47 /usr/bin/ceph-osd -f --cluster ceph --id 146 --setuser ceph --setgroup ceph
> > > > > ceph  5073 16.7 11.6 6639568 5701368 ? Ssl mars29 1866:05 /usr/bin/ceph-osd -f --cluster ceph --id 147 --setuser ceph --setgroup ceph
> > > > > ceph  5417 14.6 11.2 6386764 5519944 ? Ssl mars29 1634:30 /usr/bin/ceph-osd -f --cluster ceph --id 148 --setuser ceph --setgroup ceph
> > > > > ceph  5760 16.9 12.0 6806448 5879624 ? Ssl mars29 1882:42 /usr/bin/ceph-osd -f --cluster ceph --id 149 --setuser ceph --setgroup ceph
> > > > > ceph  6105 16.0 11.6 6576336 5694556 ? Ssl mars29 1782:52 /usr/bin/ceph-osd -f --cluster ceph --id 150 --setuser ceph --setgroup ceph
> > > > >
> > > > > daevel-ob@ssdr712h:~$ free -m
> > > > >               total        used        free      shared  buff/cache   available
> > > > > Mem:          47771       45210        1643          17         917       43556
> > > > > Swap:             0           0           0
> > > > >
> > > > > # ceph daemon osd.147 config show | grep memory_target
> > > > >     "osd_memory_target": "4294967296",
> > > > >
> > > > > And there is no recovery / backfilling, the cluster is fine:
> > > > >
> > > > > $ ceph status
> > > > >   cluster:
> > > > >     id:     de035250-323d-4cf6-8c4b-cf0faf6296b1
> > > > >     health: HEALTH_OK
> > > > >
> > > > >   services:
> > > > >     mon: 5 daemons, quorum tolriq,tsyne,olkas,lorunde,amphel
> > > > >     mgr: tsyne(active), standbys: olkas, tolriq, lorunde, amphel
> > > > >     osd: 120 osds: 116 up, 116 in
> > > > >
> > > > >   data:
> > > > >     pools:   20 pools, 12736 pgs
> > > > >     objects: 15.29M objects, 31.1TiB
> > > > >     usage:   101TiB used, 75.3TiB / 177TiB avail
> > > > >     pgs:     12732 active+clean
> > > > >              4     active+clean+scrubbing+deep
> > > > >
> > > > >   io:
> > > > >     client: 72.3MiB/s rd, 26.8MiB/s wr, 2.30kop/s rd, 1.29kop/s wr
> > > > >
> > > > > On another host, in the same pool, I also see high memory usage:
> > > > >
> > > > > daevel-ob@ssdr712g:~$ ps auxw | grep ceph-osd
> > > > > ceph  6287  6.6 10.6 6027388 5190032 ? Ssl mars21 1511:07 /usr/bin/ceph-osd -f --cluster ceph --id 131 --setuser ceph --setgroup ceph
> > > > > ceph  6759  7.3 11.2 6299140 5484412 ? Ssl mars21 1665:22 /usr/bin/ceph-osd -f --cluster ceph --id 132 --setuser ceph --setgroup ceph
> > > > > ceph  7114  7.0 11.7 6576168 5756236 ? Ssl mars21 1612:09 /usr/bin/ceph-osd -f --cluster ceph --id 133 --setuser ceph --setgroup ceph
> > > > > ceph  7467  7.4 11.1 6244668 5430512 ? Ssl mars21 1704:06 /usr/bin/ceph-osd -f --cluster ceph --id 134 --setuser ceph --setgroup ceph
> > > > > ceph  7821  7.7 11.1 6309456 5469376 ? Ssl mars21 1754:35 /usr/bin/ceph-osd -f --cluster ceph --id 135 --setuser ceph --setgroup ceph
> > > > > ceph  8174  6.9 11.6 6545224 5705412 ? Ssl mars21 1590:31 /usr/bin/ceph-osd -f --cluster ceph --id 136 --setuser ceph --setgroup ceph
> > > > > ceph  8746  6.6 11.1 6290004 5477204 ? Ssl mars21 1511:11 /usr/bin/ceph-osd -f --cluster ceph --id 137 --setuser ceph --setgroup ceph
> > > > > ceph  9100  7.7 11.6 6552080 5713560 ? Ssl mars21 1757:22 /usr/bin/ceph-osd -f --cluster ceph --id 138 --setuser ceph --setgroup ceph
> > > > >
> > > > > But, on a similar host in a different pool, the problem is less visible:
> > > > >
> > > > > daevel-ob@ssdr712i:~$ ps auxw | grep ceph-osd
> > > > > ceph  3617  2.8  9.9 5660308 4847444 ? Ssl mars29  313:05 /usr/bin/ceph-osd -f --cluster ceph --id 151 --setuser ceph --setgroup ceph
> > > > > ceph  3958  2.3  9.8 5661936 4834320 ? Ssl mars29  256:55 /usr/bin/ceph-osd -f --cluster ceph --id 152 --setuser ceph --setgroup ceph
> > > > > ceph  4299  2.3  9.8 5620616 4807248 ? Ssl mars29  266:26 /usr/bin/ceph-osd -f --cluster ceph --id 153 --setuser ceph --setgroup ceph
> > > > > ceph  4643  2.3  9.6 5527724 4713572 ? Ssl mars29  262:50 /usr/bin/ceph-osd -f --cluster ceph --id 154 --setuser ceph --setgroup ceph
> > > > > ceph  5016  2.2  9.7 5597504 4783412 ? Ssl mars29  248:37 /usr/bin/ceph-osd -f --cluster ceph --id 155 --setuser ceph --setgroup ceph
> > > > > ceph  5380  2.8  9.9 5700204 4886432 ? Ssl mars29  321:05 /usr/bin/ceph-osd -f --cluster ceph --id 156 --setuser ceph --setgroup ceph
> > > > > ceph  5724  3.1 10.1 5767456 4953484 ? Ssl mars29  352:55 /usr/bin/ceph-osd -f --cluster ceph --id 157 --setuser ceph --setgroup ceph
> > > > > ceph  6070  2.7  9.9 5683092 4868632 ? Ssl mars29  309:10 /usr/bin/ceph-osd -f --cluster ceph --id 158 --setuser ceph --setgroup ceph
> > > > >
> > > > > Is there a memory leak? Or should I expect that osd_memory_target
> > > > > (the default 4GB here) is not strictly enforced, and so reduce it?
> > > > >
> > > > > Thanks,
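For anyone who wants to watch the autotuner at work, the debug logging Mark
describes above can be enabled at runtime. A rough sketch, assuming the
default log location and reusing osd.147 from this thread:

    # raise bluestore debug logging on one OSD
    ceph tell osd.147 injectargs '--debug_bluestore 5/5'

    # watch for the autotuner lines in the OSD log
    grep -E 'target:|cache_size:' /var/log/ceph/ceph-osd.147.log

    # drop back to the default level afterwards
    ceph tell osd.147 injectargs '--debug_bluestore 1/5'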