Bluestore memory usage on our test cluster

Based on the recent conversation about bluestore memory usage, I did a survey of all of the bluestore OSDs in one of our internal test clusters. The one with the highest RSS usage at the time was osd.82:

 PID  USER PR NI    VIRT    RES  SHR S %CPU %MEM   TIME+  COMMAND
6017  ceph 20  0 4488440 2.648g 5004 S  3.0 16.9 5598:01  ceph-osd

In the grand scheme of bluestore memory usage I've seen higher RSS than this, but usually with the bluestore cache cranked up higher. On these nodes, I believe Sage said the bluestore cache size is being capped at 512MB to keep memory usage down.
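
The value an OSD is actually running with can be confirmed via the admin socket (I'm assuming the option is named bluestore_cache_size here; it has been split into per-device variants in some releases):

sudo ceph daemon osd.82 config get bluestore_cache_size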

To dig into this more, the mempool data can be dumped from the OSD's admin socket:

sudo ceph daemon osd.82 dump_mempools
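
The raw dump is in bytes; if jq is handy, something like the following condenses it to MB per pool (assuming each pool entry carries items/bytes fields; the exact JSON layout varies by release):

sudo ceph daemon osd.82 dump_mempools | \
  jq -r 'to_entries[] | "\(.key): \(.value.bytes / 1048576 * 10 | round / 10)MB"'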

A slightly compressed version of that data follows. Note that the allocated space for the bluestore_cache_* pools isn't terribly high; buffer_anon and osd_pglog together are taking up more space:

bloom_filters: 0MB
bluestore_alloc: 13.5MB
bluestore_cache_data: 0MB
bluestore_cache_onode: 234.7MB
bluestore_cache_other: 277.3MB
bluestore_fsck: 0MB
bluestore_txc: 0MB
bluestore_writing_deferred: 5.4MB
bluestore_writing: 11.1MB
bluefs: 0.1MB
buffer_anon: 386.1MB
buffer_meta: 0MB
osd: 4.4MB
osd_mapbl: 0MB
osd_pglog: 181.4MB
osdmap: 0.7MB
osdmap_mapping: 0MB
pgmap: 0MB
unittest_1: 0MB
unittest_2: 0MB

total: 1114.8MB

A heap dump from tcmalloc shows a fair amount of data yet to be returned to the OS:

sudo ceph tell osd.82 heap start_profiler
sudo ceph tell osd.82 heap dump

osd.82 dumping heap profile now.
------------------------------------------------
MALLOC:     2364583720 ( 2255.0 MiB) Bytes in use by application
MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
MALLOC: +    360267096 (  343.6 MiB) Bytes in central cache freelist
MALLOC: +     10953808 (   10.4 MiB) Bytes in transfer cache freelist
MALLOC: +    114290480 (  109.0 MiB) Bytes in thread cache freelists
MALLOC: +     13562016 (   12.9 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =   2863657120 ( 2731.0 MiB) Actual memory used (physical + swap)
MALLOC: +    997007360 (  950.8 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =   3860664480 ( 3681.8 MiB) Virtual address space used
MALLOC:
MALLOC:         156783              Spans in use
MALLOC:             35              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------


The heap stats are showing us about the same as top once the bytes released to the OS are excluded: ~2.7GiB of actual memory used versus the ~2.6GiB RSS that top reports. Of the ~2.25GiB in use by the application, the mempools account for ~1.1GB; on top of that, another ~475MiB is held by tcmalloc in its various freelists and metadata.
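
For what it's worth, tcmalloc can be asked to hand free memory back to the OS via the heap command. If I remember right this only releases the page heap freelist, which is already empty in the dump above, so it may not buy much here:

sudo ceph tell osd.82 heap release
sudo ceph tell osd.82 heap stats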

The question is where the other ~1.1GB of application memory goes. Is it allocations that aren't made via the mempools? Heap fragmentation? Some combination of the two? I don't actually know how to get heap fragmentation statistics out of tcmalloc, but jemalloc would potentially let us compute them from the output of:

malloc_stats_print()

External fragmentation: 1.0 - (allocated/active)
Virtual fragmentation: 1.0 - (active/mapped)
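
Those two ratios are trivial to compute once the allocated/active/mapped counters are scraped out of the malloc_stats_print() output. A quick sketch with placeholder values, not real jemalloc output from this OSD:

awk -v allocated=2364583720 -v active=2600000000 -v mapped=3860664480 'BEGIN {
    printf "external fragmentation: %.2f\n", 1.0 - allocated / active
    printf "virtual fragmentation:  %.2f\n", 1.0 - active / mapped
}'

With those made-up numbers it would report ~0.09 external and ~0.33 virtual fragmentation.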

Mark