On 5/13/20 12:43 AM, Rafał Wądołowski wrote:
Hi,

I noticed a strange situation in one of our clusters. The OSD daemons are taking too much RAM. We are running 12.2.12 and have the default configuration of osd_memory_target (4GiB).

Heap dump shows:

osd.2969 dumping heap profile now.
------------------------------------------------
MALLOC:     6381526944 ( 6085.9 MiB) Bytes in use by application
MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
MALLOC: +    173373288 (  165.3 MiB) Bytes in central cache freelist
MALLOC: +     17163520 (   16.4 MiB) Bytes in transfer cache freelist
MALLOC: +     95339512 (   90.9 MiB) Bytes in thread cache freelists
MALLOC: +     28995744 (   27.7 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =   6696399008 ( 6386.2 MiB) Actual memory used (physical + swap)
MALLOC: +    218267648 (  208.2 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =   6914666656 ( 6594.3 MiB) Virtual address space used
MALLOC:
MALLOC:         408276              Spans in use
MALLOC:             75              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.

IMO "Bytes in use by application" should be less than osd_memory_target. Am I correct?

I checked the heap dump with google-pprof and got the following results:

Total: 149.4 MB
    60.5  40.5%  40.5%     60.5  40.5% rocksdb::UncompressBlockContentsForCompressionType
    34.2  22.9%  63.4%     34.2  22.9% ceph::buffer::create_aligned_in_mempool
    11.9   7.9%  71.3%     12.1   8.1% std::_Rb_tree::_M_emplace_hint_unique
    10.7   7.1%  78.5%     71.2  47.7% rocksdb::ReadBlockContents

Does it mean that most of the RAM is used by rocksdb?
It looks like your heap dump only accounts for 149.4 MB of the memory, so it is probably not representative of the whole ~6.5G. Instead, could you try dumping the mempools via "ceph daemon osd.2969 dump_mempools"?
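For anyone following along, here is a rough Python sketch of how the dump_mempools JSON could be ranked by size once it has been saved and loaded with json.load(). The nested "mempool" / "by_pool" layout is an assumption about the output format (some releases print the pools flat at the top level, so both are handled), and the byte counts in SAMPLE_DUMP are made up for illustration, not taken from osd.2969.

```python
def summarize_mempools(dump, top=5):
    """Return (total_bytes, [(pool_name, bytes), ...]) with the largest pools first."""
    # Assumption: pools sit either under dump["mempool"]["by_pool"]
    # or flat at the top level next to a "total" entry.
    root = dump.get("mempool", dump)
    if "by_pool" in root:
        pools = root["by_pool"]
    else:
        pools = {name: stats for name, stats in root.items() if name != "total"}

    ranked = sorted(
        ((name, stats["bytes"]) for name, stats in pools.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    # Fall back to summing the pools if no "total" entry is present.
    total = root.get("total", {}).get("bytes", sum(size for _, size in ranked))
    return total, ranked[:top]


# Hypothetical numbers, only to show the expected shape of the input.
SAMPLE_DUMP = {
    "mempool": {
        "by_pool": {
            "bluestore_cache_data": {"items": 10, "bytes": 1000},
            "bluestore_cache_onode": {"items": 5, "bytes": 3000},
            "osd_pglog": {"items": 7, "bytes": 2000},
        },
        "total": {"items": 22, "bytes": 6000},
    }
}

if __name__ == "__main__":
    total, ranked = summarize_mempools(SAMPLE_DUMP)
    print(f"mempool total: {total} bytes")
    for name, size in ranked:
        print(f"{name:<28} {size:>12} bytes")
```

If the mempool total comes out far below the "Bytes in use by application" figure from the heap stats, the extra memory is being held somewhere the mempools do not track.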
How can I take a deeper look into memory usage?
Beyond looking at the mempools, you can see the bluestore cache allocation information by either enabling debug bluestore and debug priority_cache_manager at level 5, or potentially by looking at the PCM perf counters (I'm not sure if those were in 12.2.12 though). Between the heap data, mempool data, and priority cache records, it should become clearer what's going on.
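As a rough illustration of putting those numbers side by side, the Python sketch below parses the "MALLOC:" lines from the heap stats quoted in this thread and reports how far "Bytes in use by application" sits above the 4 GiB osd_memory_target and above a mempool total. The function names are made up for this example, and the mempool total passed in is a placeholder you would replace with the real value from dump_mempools.

```python
import re

# Matches lines such as "MALLOC: + 173373288 ( 165.3 MiB) Bytes in central cache freelist".
HEAP_LINE = re.compile(r"MALLOC:\s*[+=]?\s*(\d+)\s*\(\s*[\d.]+ MiB\)\s*(.+)")

# Three lines copied from the heap stats in the original message.
HEAP_SAMPLE = """\
MALLOC:     6381526944 ( 6085.9 MiB) Bytes in use by application
MALLOC: +    173373288 (  165.3 MiB) Bytes in central cache freelist
MALLOC: =   6696399008 ( 6386.2 MiB) Actual memory used (physical + swap)
"""


def parse_heap_stats(text):
    """Map each tcmalloc stat label to its byte count; other lines are skipped."""
    stats = {}
    for line in text.splitlines():
        match = HEAP_LINE.match(line.strip())
        if match:
            stats[match.group(2).strip()] = int(match.group(1))
    return stats


def memory_gap(heap_text, mempool_total_bytes, memory_target=4 * 1024**3):
    """Compare application heap usage against the target and the mempool total."""
    in_use = parse_heap_stats(heap_text)["Bytes in use by application"]
    return {
        "in_use": in_use,
        "over_target": in_use - memory_target,
        "outside_mempools": in_use - mempool_total_bytes,
    }


if __name__ == "__main__":
    # 3_000_000_000 is a placeholder mempool total, not a measured value.
    for key, value in memory_gap(HEAP_SAMPLE, 3_000_000_000).items():
        print(f"{key:<18} {value / 1024**2:10.1f} MiB")
```

A large "outside_mempools" figure would suggest looking at allocations the mempools do not cover, while a small one would point back at the cache sizing itself.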
Mark
Regards,
Rafał Wądołowski
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx