Hi Brett,
Can you enable debug_bluestore = 5 and debug_prioritycache = 5 on one of
the OSDs that's showing the behavior? You'll want to look in the logs
for lines that look like this:
2019-07-18T19:34:42.587-0400 7f4048b8d700 5 prioritycache tune_memory
target: 4294967296 mapped: 4260962304 unmapped: 856948736 heap:
5117911040 old mem: 2845415707 new mem: 2845415707
2019-07-18T19:34:33.527-0400 7f4048b8d700 5
bluestore.MempoolThread(0x55a6d330ead0) _resize_shards cache_size:
2845415707 kv_alloc: 1241513984 kv_used: 874833889 meta_alloc:
1258291200 meta_used: 889040246 data_alloc: 318767104 data_used: 0
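To get those lines out of one of the affected OSDs without restarting it,
something like this should work on Nautilus (osd.12 and the log path are
just placeholders, substitute one of the OSDs showing the behavior):

  ceph config set osd.12 debug_bluestore 5
  ceph config set osd.12 debug_prioritycache 5

  # or inject them directly into the running daemon
  ceph tell osd.12 injectargs '--debug_bluestore 5 --debug_prioritycache 5'

  # then watch for the relevant lines in that OSD's log
  grep -E 'tune_memory|_resize_shards' /var/log/ceph/ceph-osd.12.log | tail -20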
The first (tune_memory) line tells you what your memory target is set to,
how much memory is currently mapped, how much is unmapped (i.e. what's
been freed but the kernel hasn't reclaimed yet), the total heap size, and
the old and new aggregate size for all of BlueStore's caches. The second
(_resize_shards) line also shows the aggregate cache size, and then how
much space is being allocated and used for the kv, meta, and data caches.
If there's a leak somewhere in the OSD or BlueStore, the autotuner will
shrink the cache way down, but eventually it won't be able to contain the
growth and the process will keep growing beyond the target size despite
having only a tiny amount of BlueStore cache. If it's something else,
like a huge amount of freed memory not being reclaimed by the kernel,
you'll see a large amount of unmapped memory and a big heap size while
the mapped memory stays near the target. And if it's a bug in the
autotuner itself, we might see the mapped memory greatly exceeding the
target.
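If it's easier to watch the trend than to eyeball individual lines, a
quick throwaway script along these lines will pull the target, mapped,
unmapped, and heap numbers out of the tune_memory lines so you can see
which of those patterns you're hitting (the log path is just a
placeholder, and the regex assumes the line format shown above):

  #!/usr/bin/env python3
  # Rough sketch: summarize prioritycache tune_memory lines from an OSD log.
  import re, sys

  # Default path is just an example; pass the real log file as an argument.
  LOG = sys.argv[1] if len(sys.argv) > 1 else "/var/log/ceph/ceph-osd.12.log"

  # Matches the sample output above, e.g.
  # "... prioritycache tune_memory target: 4294967296 mapped: ... heap: ..."
  pat = re.compile(
      r"tune_memory target: (\d+) mapped: (\d+) unmapped: (\d+) heap: (\d+)"
  )

  with open(LOG) as f:
      for line in f:
          m = pat.search(line)
          if not m:
              continue
          target, mapped, unmapped, heap = (int(x) for x in m.groups())
          gib = 1024 ** 3
          print(
              "target %.2f GiB  mapped %.2f GiB  unmapped %.2f GiB  heap %.2f GiB"
              % (target / gib, mapped / gib, unmapped / gib, heap / gib)
          )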
Mark
On 7/18/19 4:02 PM, Brett Kelly wrote:
Hello,
We have a Nautilus cluster exhibiting what looks like this bug:
https://tracker.ceph.com/issues/39618
No matter what is set as the osd_memory_target (currently 2147483648),
each OSD process surpasses this value, peaks at around 4.0 GB, and then
eventually starts using swap. The cluster stays stable for about a week,
then starts running into OOM issues and killing off OSDs, and requires a
reboot of each node to get back to a stable state.
Has anyone run into anything similar, or found any workarounds?
Ceph version: 14.2.1, RGW Clients
CentOS Linux release 7.6.1810 (Core)
Kernel: 3.10.0-957.12.1.el7.x86_64
256GB RAM per OSD node, 60 OSDs in each node.
Thanks,
--
Brett Kelly
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com