Hi Adam

Seems like the mds_cache_memory_limit, both set globally through cephadm and
on the hosts' mds daemons, is set to approx. 4 GB:

root@my-ceph01:/# ceph config get mds mds_cache_memory_limit
4294967296

Same if I query the individual mds daemons running on my-ceph01, or any of
the other mds daemons on the other hosts.

On Tue, Apr 9, 2024 at 6:14 PM Mads Aasted <mads2a@xxxxxxxxx> wrote:

> Hi Adam
>
> Let me just finish tucking in a devilish tyke here and I'll get to it
> first thing.
>
> On Tue, Apr 9, 2024 at 18:09, Adam King <adking@xxxxxxxxxx> wrote:
>
>> I did end up writing a unit test to see what we calculated here, as well
>> as adding a bunch of debug logging (haven't created a PR yet, but
>> probably will). The total memory was set to 19858056 * 1024 * 0.7 (total
>> memory in bytes * the autotune target ratio) = 14234254540. What ended up
>> getting logged was the following (ignore the daemon ids, they don't
>> affect anything; only the types matter):
>>
>> DEBUG cephadm.autotune:autotune.py:35 Autotuning OSD memory with given parameters:
>> Total memory: 14234254540
>> Daemons: [<DaemonDescription>(crash.a), <DaemonDescription>(grafana.a),
>>   <DaemonDescription>(mds.a), <DaemonDescription>(mds.b),
>>   <DaemonDescription>(mds.c), <DaemonDescription>(mgr.a),
>>   <DaemonDescription>(mon.a), <DaemonDescription>(node-exporter.a),
>>   <DaemonDescription>(osd.1), <DaemonDescription>(osd.2),
>>   <DaemonDescription>(osd.3), <DaemonDescription>(osd.4),
>>   <DaemonDescription>(prometheus.a)]
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 134217728 from total for crash daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: 14100036812
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for grafana daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: 13026294988
>> DEBUG cephadm.autotune:autotune.py:40 Subtracting 17179869184 from total for mds daemon
>> DEBUG cephadm.autotune:autotune.py:42 new total: -4153574196
>> DEBUG cephadm.autotune:autotune.py:40 Subtracting 17179869184 from total for mds daemon
>> DEBUG cephadm.autotune:autotune.py:42 new total: -21333443380
>> DEBUG cephadm.autotune:autotune.py:40 Subtracting 17179869184 from total for mds daemon
>> DEBUG cephadm.autotune:autotune.py:42 new total: -38513312564
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 4294967296 from total for mgr daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: -42808279860
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for mon daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: -43882021684
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for node-exporter daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: -44955763508
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for prometheus daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: -46029505332
>>
>> It looks like it was taking pretty much all the memory away for the mds
>> daemons. The amount, however, is taken from the "mds_cache_memory_limit"
>> setting for each mds daemon, and the value it defaulted to in the test is
>> quite large, so I'd need to know what that setting comes out to for the
>> mds daemons in your cluster to get the full picture. Also, you can see
>> the total go well into the negatives here.
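
For reference, here is a minimal Python sketch that reproduces the arithmetic
in the log above. It is not the actual cephadm autotune code; the per-daemon
sizes are simply copied from the log lines, with 17179869184 being the
mds_cache_memory_limit value the unit test defaulted to.

ratio = 0.7                            # mgr/cephadm/autotune_memory_target_ratio (default)
total = int(19858056 * 1024 * ratio)   # 14234254540, as in the log

# (daemon type, bytes subtracted), taken straight from the log lines
subtractions = [
    ("crash", 134217728),
    ("grafana", 1073741824),
    ("mds", 17179869184),              # mds_cache_memory_limit in the unit test
    ("mds", 17179869184),
    ("mds", 17179869184),
    ("mgr", 4294967296),
    ("mon", 1073741824),
    ("node-exporter", 1073741824),
    ("prometheus", 1073741824),
]

for daemon_type, size in subtractions:
    total -= size
    print(f"after {daemon_type}: {total}")

# Ends at -46029505332, matching the last log line. A single 16 GiB mds
# subtraction already exceeds the ~14.2 GB starting total, so nothing is
# left to divide among the OSDs.
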
>> When that happens, cephadm just tries to remove the osd_memory_target
>> config settings for the OSDs on the host, but given the error message
>> from your initial post, it must be getting some positive value when
>> actually running on your system.
>>
>> On Fri, Apr 5, 2024 at 2:21 AM Mads Aasted <mads2a@xxxxxxxxx> wrote:
>>
>>> Hi Adam
>>>
>>> No problem, I really appreciate your input :)
>>> The memory stats returned are as follows:
>>>
>>> "memory_available_kb": 19858056,
>>> "memory_free_kb": 277480,
>>> "memory_total_kb": 32827840,
>>>
>>> On Thu, Apr 4, 2024 at 10:14 PM Adam King <adking@xxxxxxxxxx> wrote:
>>>
>>>> Sorry to keep asking for more info, but can I also get what `cephadm
>>>> gather-facts` on that host returns for "memory_total_kb"? Might end up
>>>> creating a unit test out of this case if we have a calculation bug here.
>>>>
>>>> On Thu, Apr 4, 2024 at 4:05 PM Mads Aasted <mads2a@xxxxxxxxx> wrote:
>>>>
>>>>> Sorry for the double send, forgot to hit reply-all so it would appear
>>>>> on the page.
>>>>>
>>>>> Hi Adam
>>>>>
>>>>> If we multiply by 0.7 and work through the previous example from that
>>>>> number, we would still arrive at roughly 2.5 GB for each OSD, and the
>>>>> host in question is trying to set it to less than 500 MB.
>>>>> I have attached a list of the processes running on the host. You can
>>>>> even see that the OSDs are currently taking up the most memory by far,
>>>>> each at least 5x the proposed minimum.
>>>>>
>>>>> root@my-ceph01:/# ceph orch ps | grep my-ceph01
>>>>> crash.my-ceph01              my-ceph01                running (3w)  7m ago  13M  9052k  -      17.2.6
>>>>> grafana.my-ceph01            my-ceph01   *:3000       running (3w)  7m ago  13M  95.6M  -      8.3.5
>>>>> mds.testfs.my-ceph01.xjxfzd  my-ceph01                running (3w)  7m ago  10M  485M   -      17.2.6
>>>>> mds.prodfs.my-ceph01.rplvac  my-ceph01                running (3w)  7m ago  12M  26.9M  -      17.2.6
>>>>> mds.prodfs.my-ceph01.twikzd  my-ceph01                running (3w)  7m ago  12M  26.2M  -      17.2.6
>>>>> mgr.my-ceph01.rxdefe         my-ceph01   *:8443,9283  running (3w)  7m ago  13M  907M   -      17.2.6
>>>>> mon.my-ceph01                my-ceph01                running (3w)  7m ago  13M  503M   2048M  17.2.6
>>>>> node-exporter.my-ceph01      my-ceph01   *:9100       running (3w)  7m ago  13M  20.4M  -      1.5.0
>>>>> osd.3                        my-ceph01                running (3w)  7m ago  11M  2595M  4096M  17.2.6
>>>>> osd.5                        my-ceph01                running (3w)  7m ago  11M  2494M  4096M  17.2.6
>>>>> osd.6                        my-ceph01                running (3w)  7m ago  11M  2698M  4096M  17.2.6
>>>>> osd.9                        my-ceph01                running (3w)  7m ago  11M  3364M  4096M  17.2.6
>>>>> prometheus.my-ceph01         my-ceph01   *:9095       running (3w)  7m ago  13M  164M   -      2.42.0
>>>>>
>>>>> On Thu, Mar 28, 2024 at 2:13 AM Adam King <adking@xxxxxxxxxx> wrote:
>>>>>
>>>>>> I missed a step in the calculation. The total_memory_kb I mentioned
>>>>>> earlier is also multiplied by the value of
>>>>>> mgr/cephadm/autotune_memory_target_ratio before doing the
>>>>>> subtractions for all the daemons. That value defaults to 0.7, which
>>>>>> might explain why it seems to arrive at a lower value than expected.
>>>>>> Beyond that, I think I'd need a list of the daemon types and counts
>>>>>> on that host to try and work through what it's doing.
>>>>>>
>>>>>> On Wed, Mar 27, 2024 at 10:47 AM Mads Aasted <mads2a@xxxxxxxxx> wrote:
>>>>>>
>>>>>>> Hi Adam
>>>>>>>
>>>>>>> Doing the calculations with what you are stating here, I arrive at a
>>>>>>> total sum of roughly 13.3 GB for all the listed processes except the
>>>>>>> OSDs, leaving well in excess of 4 GB for each OSD.
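
As a cross-check, the same formula can be worked through with this host's
actual numbers. This is only a sketch and rests on two assumptions: that the
starting point is memory_total_kb (per the description quoted further down),
and that each of the three mds daemons is subtracted at the 4 GiB
mds_cache_memory_limit reported at the top of the thread; the other daemons
use the min_size_by_type / default_size values quoted further down.

ratio = 0.7                              # mgr/cephadm/autotune_memory_target_ratio
total = int(32827840 * 1024 * ratio)     # memory_total_kb from gather-facts -> 23530995712

non_osd = [
    ("crash", 128 * 1048576),            # min_size_by_type
    ("grafana", 1024 * 1048576),         # default_size
    ("mds", 4294967296),                 # mds_cache_memory_limit, 3 mds daemons on this host
    ("mds", 4294967296),
    ("mds", 4294967296),
    ("mgr", 4096 * 1048576),             # min_size_by_type
    ("mon", 1024 * 1048576),             # min_size_by_type
    ("node-exporter", 1024 * 1048576),   # default_size
    ("prometheus", 1024 * 1048576),      # default_size
]

for _, size in non_osd:
    total -= size

num_osds = 4                             # osd.3, osd.5, osd.6, osd.9
print(total // num_osds)                 # 480485376

Under those assumptions the result lands exactly on the 480485376 from the
error message in the original post (quoted further down), which is below the
939524096 minimum the OSD config will accept.
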
>>>>>>> Besides the mon daemon, which I can tell on my host has a limit of
>>>>>>> 2 GB, none of the other daemons seem to have a limit set according
>>>>>>> to ceph orch ps. Then again, they are nowhere near the values stated
>>>>>>> in the min_size_by_type that you list.
>>>>>>> Obviously, yes, I could disable the autotuning, but that would leave
>>>>>>> me none the wiser as to why this exact host is trying to do this.
>>>>>>>
>>>>>>> On Tue, Mar 26, 2024 at 10:20 PM Adam King <adking@xxxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> For context, the autotune starts from the value `cephadm
>>>>>>>> gather-facts` reports for the host (the "memory_total_kb" field)
>>>>>>>> and then subtracts from that per daemon on the host according to
>>>>>>>>
>>>>>>>> min_size_by_type = {
>>>>>>>>     'mds': 4096 * 1048576,
>>>>>>>>     'mgr': 4096 * 1048576,
>>>>>>>>     'mon': 1024 * 1048576,
>>>>>>>>     'crash': 128 * 1048576,
>>>>>>>>     'keepalived': 128 * 1048576,
>>>>>>>>     'haproxy': 128 * 1048576,
>>>>>>>>     'nvmeof': 4096 * 1048576,
>>>>>>>> }
>>>>>>>> default_size = 1024 * 1048576
>>>>>>>>
>>>>>>>> What's left is then divided by the number of OSDs on the host to
>>>>>>>> arrive at the value. I'll also add, since it seems to be an issue
>>>>>>>> on this particular host: if you add the "_no_autotune_memory" label
>>>>>>>> to the host, it will stop trying to do this on that host.
>>>>>>>>
>>>>>>>> On Mon, Mar 25, 2024 at 6:32 PM <mads2a@xxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>>> I have a virtual ceph cluster running 17.2.6 with 4 Ubuntu 22.04
>>>>>>>>> hosts in it, each with 4 OSDs attached. The first 2 servers,
>>>>>>>>> hosting the mgrs, have 32 GB of RAM each, and the remaining have
>>>>>>>>> 24 GB.
>>>>>>>>> For some reason I am unable to identify, the first host in the
>>>>>>>>> cluster appears to constantly be trying to set the
>>>>>>>>> osd_memory_target variable to roughly half of the calculated
>>>>>>>>> minimum. I see the following spamming the logs constantly:
>>>>>>>>>
>>>>>>>>> Unable to set osd_memory_target on my-ceph01 to 480485376: error
>>>>>>>>> parsing value: Value '480485376' is below minimum 939524096
>>>>>>>>>
>>>>>>>>> The default is set to 4294967296. I did double check, and
>>>>>>>>> osd_memory_base (805306368) + osd_memory_cache_min (134217728)
>>>>>>>>> adds up to that minimum exactly.
>>>>>>>>> osd_memory_target_autotune is currently enabled, but I cannot for
>>>>>>>>> the life of me figure out how it is arriving at 480485376 as a
>>>>>>>>> value for that particular host, which even has the most RAM.
>>>>>>>>> Neither the cluster nor the host is anywhere near max memory
>>>>>>>>> utilization, so it's not as if processes are competing for
>>>>>>>>> resources.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
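
For completeness, a quick sketch of the minimum check behind the error message
in the original post, assuming (as the post describes) that the floor is
osd_memory_base plus osd_memory_cache_min:

osd_memory_base = 805306368        # from the original post
osd_memory_cache_min = 134217728   # from the original post
minimum = osd_memory_base + osd_memory_cache_min   # 939524096

proposed = 480485376               # the value cephadm keeps trying to set
print(proposed < minimum)          # True -> "Value '480485376' is below minimum 939524096"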