Hi Mark,
thanks a lot for your explanation and clarification.
Adjusting osd_memory_target to fit within our systems' RAM did the trick.
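In case it helps anyone else, lowering the target is just a matter of something like the below (the value is only an example, not necessarily the one we picked; it depends on RAM and OSD count per host):

    # ceph.conf on each OSD host -- example value, roughly 2.5GB per OSD
    [osd]
    osd_memory_target = 2684354560

    # it can also be injected into running OSDs, e.g.
    ceph tell osd.* injectargs '--osd_memory_target 2684354560'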
Jaime
On 07/08/2019 14:09, Mark Nelson wrote:
Hi Jaime,
we only use the cache size parameters now if you've disabled
autotuning. With autotuning we adjust the cache size on the fly to
try to keep the mapped process memory under the osd_memory_target.

You can set a lower memory target than the default, though you will
have far less cache for bluestore onodes and rocksdb. You may notice
that it's slower, especially if you have a big active data set you are
processing. I don't usually recommend setting the osd_memory_target
below 2GB. At some point the autotuner will have shrunk the caches as
far as it can and the process memory may start exceeding the target
(with our default rocksdb and pglog settings this usually happens
somewhere between 1.3-1.7GB once the OSD has been sufficiently
saturated with IO).

Given memory prices right now, I'd still recommend upgrading RAM
if you have the ability, though. You might be able to get away with
setting each OSD to 2-2.5GB in your scenario, but you'll be pushing it.
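Rough numbers for a 12-OSD, 32GB host like yours (ignoring the OS,
page cache and other daemons):

    12 OSDs x 2.5GB target = 30GB  (~2GB left for everything else)
    12 OSDs x 2.0GB target = 24GB  (~8GB left for everything else)

which is why I'd treat 2GB as the practical floor there.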
I would not recommend lowering osd_memory_cache_min. You really
want the rocksdb indexes/filters fitting in cache, and as many bluestore
onodes as you can get. In any event, you'll still be bound by the
(currently hardcoded) 64MB cache chunk allocation size in the
autotuner, which osd_memory_cache_min can't reduce (and that's per
cache, while osd_memory_cache_min is global for the kv, buffer, and
rocksdb block caches). I.e., each cache is going to get 64MB plus growth
room regardless of how low you set osd_memory_cache_min. That's
intentional: we don't want a single SST file in rocksdb to be able
to completely blow everything else out of the block cache during
compaction, only to quickly become invalid, be removed from the cache,
and make it look to the priority cache system like rocksdb doesn't
actually need any more memory for cache.
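So as a rough floor from those chunks alone:

    3 caches (kv, buffer, rocksdb block) x 64MB = ~192MB per OSD, before any growth.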
Mark
On 8/7/19 7:44 AM, Jaime Ibar wrote:
Hi all,
we run a Ceph Luminous 12.2.12 cluster with 7 OSD servers, 12x4TB disks
each.
Recently we redeployed the OSDs of one of them using the bluestore
backend; however, since then we've been hitting out-of-memory errors
(oom-killer invoked) and the OS kills one of the ceph-osd processes.
The OSD is restarted automatically and is back online after one minute.
We're running Ubuntu 16.04, kernel 4.15.0-55-generic.
The server has 32GB of RAM and a 4GB swap partition.
All the disks are HDDs; there are no SSDs.
Bluestore settings are the default ones:
"osd_memory_target": "4294967296"
"osd_memory_cache_min": "134217728"
"bluestore_cache_size": "0"
"bluestore_cache_size_hdd": "1073741824"
"bluestore_cache_autotune": "true"
As stated in the documentation, bluestore assigns by default 4GB of
RAM per OSD (1GB of RAM per 1TB).
So in this case 48GB of RAM would be needed. Am I right?
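(Rough math: 12 OSDs x 4GB default osd_memory_target = 48GB, while the
box has 32GB of RAM + 4GB of swap = 36GB.)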
Are these the minimum requirements for bluestore?
In case adding more RAM is not an option, can any of
osd_memory_target, osd_memory_cache_min or bluestore_cache_size_hdd
be decreased to fit our server specs?
Would this have any impact on performance?
Thanks
Jaime
--
Jaime Ibar
High Performance & Research Computing, IS Services
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
http://www.tchpc.tcd.ie/ | jaime@xxxxxxxxxxxx
Tel: +353-1-896-3725
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com