I missed a step in the calculation. The total_memory_kb I mentioned earlier is also
multiplied by the value of mgr/cephadm/autotune_memory_target_ratio before the
subtractions for all the daemons are done. That value defaults to 0.7, which might
explain why it seems to be getting a lower value than expected. Beyond that, I think
I'd need a list of the daemon types and counts on that host to try and work through
what it's doing.

On Wed, Mar 27, 2024 at 10:47 AM Mads Aasted <mads2a@xxxxxxxxx> wrote:

> Hi Adam.
>
> Doing the calculations with what you state here, I arrive at a total of roughly
> 13.3 GB for all the listed processes except the OSDs, leaving well in excess of
> 4 GB for each OSD.
> Besides the mon daemon, which I can see has a 2 GB limit on my host, none of the
> other daemons seem to have a limit set according to ceph orch ps. Then again, they
> are nowhere near the values in the min_size_by_type you list.
> Obviously I could disable the autotuning, but that would leave me none the wiser
> as to why this particular host is trying to do this.
>
>
>
> On Tue, Mar 26, 2024 at 10:20 PM Adam King <adking@xxxxxxxxxx> wrote:
>
>> For context, the autotune takes the value from `cephadm gather-facts` on the
>> host (the "memory_total_kb" field) and then subtracts from that per daemon on
>> the host according to
>>
>> min_size_by_type = {
>>     'mds': 4096 * 1048576,
>>     'mgr': 4096 * 1048576,
>>     'mon': 1024 * 1048576,
>>     'crash': 128 * 1048576,
>>     'keepalived': 128 * 1048576,
>>     'haproxy': 128 * 1048576,
>>     'nvmeof': 4096 * 1048576,
>> }
>> default_size = 1024 * 1048576
>>
>> What's left is then divided by the number of OSDs on the host to arrive at the
>> per-OSD value. I'll also add, since it seems to be an issue on this particular
>> host: if you add the "_no_autotune_memory" label to the host, it will stop
>> trying to do this on that host.
>>
>> On Mon, Mar 25, 2024 at 6:32 PM <mads2a@xxxxxxxxx> wrote:
>>
>>> I have a virtual Ceph cluster running 17.2.6 with 4 Ubuntu 22.04 hosts, each
>>> with 4 OSDs attached. The first 2 servers, which host the mgrs, have 32 GB of
>>> RAM each, and the remaining hosts have 24 GB.
>>> For some reason I am unable to identify, the first host in the cluster appears
>>> to be constantly trying to set the osd_memory_target variable to roughly half
>>> of the calculated minimum, and I see the following spamming the logs constantly:
>>>
>>> Unable to set osd_memory_target on my-ceph01 to 480485376: error parsing
>>> value: Value '480485376' is below minimum 939524096
>>>
>>> The default is set to 4294967296. I did double-check, and osd_memory_base
>>> (805306368) + osd_memory_cache_min (134217728) adds up to that minimum exactly.
>>> osd_memory_target_autotune is currently enabled, but I cannot for the life of
>>> me figure out how it arrives at 480485376 as a value for that particular host,
>>> which even has the most RAM. Neither the cluster nor the host is anywhere near
>>> maximum memory utilization, so it's not as if processes are competing for
>>> resources.
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>
>>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
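
As a rough sketch of the calculation described in this thread, the snippet below
approximates the logic as explained above (ratio * total memory, minus per-daemon
minimum sizes, divided by the OSD count). It is not the actual cephadm code, and the
daemon counts passed in are hypothetical examples.

# Approximation of the autotune logic described above; not the real cephadm code.
MIB = 1048576

min_size_by_type = {
    'mds': 4096 * MIB,
    'mgr': 4096 * MIB,
    'mon': 1024 * MIB,
    'crash': 128 * MIB,
    'keepalived': 128 * MIB,
    'haproxy': 128 * MIB,
    'nvmeof': 4096 * MIB,
}
default_size = 1024 * MIB


def estimate_osd_memory_target(memory_total_kb, daemons_by_type, ratio=0.7):
    """Approximate the per-OSD target the autotuner would compute.

    memory_total_kb  -- "memory_total_kb" from `cephadm gather-facts`
    daemons_by_type  -- e.g. {'mgr': 1, 'mon': 1, 'crash': 1, 'osd': 4}
    ratio            -- mgr/cephadm/autotune_memory_target_ratio (default 0.7)
    """
    remaining = memory_total_kb * 1024 * ratio
    for daemon_type, count in daemons_by_type.items():
        if daemon_type == 'osd':
            continue  # OSDs split whatever memory is left over
        remaining -= count * min_size_by_type.get(daemon_type, default_size)
    return int(remaining / daemons_by_type.get('osd', 1))


# Hypothetical 32 GB host running one mgr, one mon, one crash daemon, and 4 OSDs:
print(estimate_osd_memory_target(32 * 1024 * 1024,
                                 {'mgr': 1, 'mon': 1, 'crash': 1, 'osd': 4}))
# -> roughly 4.6 GB per OSD, consistent with the "well in excess of 4 GB" figure
#    above; a much lower result would suggest more (or heavier) daemons on the
#    host, or a different ratio, than expected.

Plugging in the real daemon counts from `ceph orch ps` for the affected host should
show where the remaining memory is going.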