I'm running Kraken built from Git right now, and I've found that my OSDs eat as much memory as they can get before they're killed by the OOM killer. I understand that Bluestore is experimental, but I thought this behaviour should be known.
My setup:
- Xeon D-1540, 32GB DDR4 ECC RAM
- Arch Linux
- Single node, 4 8TB OSDs, each prepared with "ceph-disk prepare --bluestore /dev/sdX" (see the example below this list)
- Built from Git fac6335a1eea12270f76cf2c7814648669e6515a
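For anyone trying to reproduce, bringing the OSDs up looks roughly like this (device names are placeholders for my four 8TB disks, and udev may handle the activate step automatically):

    for dev in /dev/sdb /dev/sdc /dev/sdd /dev/sde; do
        # create a bluestore OSD on the whole disk
        ceph-disk prepare --bluestore "$dev"
        # activate the small data partition that ceph-disk created (partition 1)
        ceph-disk activate "${dev}1"
    done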
Steps to reproduce:
- Start mon
- Start OSDs
- ceph osd pool create pool 256 256 erasure myprofile storage (an example profile definition follows after these steps)
- rados bench -p pool <time> write -t 32
- ceph osd pool delete pool
- ceph osd pool create pool 256 256 replicated
- rados bench -p pool <time> write -t 32
- ceph osd pool delete pool
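If you don't already have an erasure profile when reproducing this, something along these lines should work on a single node with 4 OSDs (the k/m values are only an example, not necessarily what I used, and on newer builds the parameter may be called crush-failure-domain):

    # example erasure profile for a 4-OSD single node
    ceph osd erasure-code-profile set myprofile k=2 m=2 ruleset-failure-domain=osd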
The OSDs start at ~500M used each (according to "ceph tell osd.0 heap stats") before any PGs are assigned to them. After creating and peering the PGs, they're at ~514M each.
After running rados bench for 10s, memory is at ~727M each. Running pprof on a dump shows the top entry as:
218.9 96.1% 96.1% 218.9 96.1% ceph::buffer::create_aligned
Running rados bench for another 10s pushes memory to 836M each, and pprof again shows similar results:
305.2 96.8% 96.8% 305.2 96.8% ceph::buffer::create_aligned
I can continue this process until the OSDs are killed by OOM.
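For reference, this is roughly how I've been collecting and reading the heap dumps (the binary and log paths below are the defaults and may differ on your system; the tool may be called google-pprof depending on the distro):

    # enable the tcmalloc heap profiler on one OSD and take a dump
    ceph tell osd.0 heap start_profiler
    ceph tell osd.0 heap dump
    # dumps land in the OSD's log directory, e.g. /var/log/ceph/osd.0.profile.0001.heap
    pprof --text /usr/bin/ceph-osd /var/log/ceph/osd.0.profile.0001.heap
    # stop profiling when finished
    ceph tell osd.0 heap stop_profiler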
This only happens with Bluestore; other backends (like filestore) work fine.
When I delete the pool, the OSDs release the memory and return to their ~500M resting point.
Repeating the test with a replicated pool results in the OSDs consuming elevated memory (~610M peak) while writing, but they return to their resting levels once the writes stop.
It'd be great if I could do something about this myself, but I don't understand the code very well, and I can't work out whether there's a way to trace the call paths behind the memory allocations the way there is for CPU usage.
Any advice or solution would be much appreciated.
Thanks!
Lucas