Reed Dier <reed.dier@xxxxxxxxxxx> writes:

> I don't have a solution to offer, but I've seen this for years with no solution.
> Any time a MGR bounces, be it for upgrades, or a new daemon coming online, etc, I'll see a scale spike like is reported below.

Interesting to read that we are not the only ones.

> Just out of curiosity, which MGR plugins are you using?

[22:11:05] black2.place6:~# ceph mgr module ls
{
    "always_on_modules": [
        "balancer",
        "crash",
        "devicehealth",
        "orchestrator_cli",
        "progress",
        "rbd_support",
        "status",
        "volumes"
    ],
    "enabled_modules": [
        "iostat",
        "pg_autoscaler",
        "prometheus",
        "restful"
    ],

> I have historically used the influx plugin for stats exports, and it shows up in those values as well, throwing everything off.

So the problem is unlikely to be related to the prometheus plugin; it looks more like a statistics error somewhere else.

> I don't see it in my Zabbix stats, albeit those are scraped at a
> longer interval that may not catch this.

For prometheus, we scrape every 10 or 15 seconds. But I wonder whether a longer interval really flattens this out or whether the logic is actually different.

Out of curiosity from my side: the manager is a binary, but the plugins are actually Python modules. I had a quick look at /usr/share/ceph/mgr/prometheus/module.py, which seems to get the data from a monitor, so I wonder if the problem lies more in the architecture of Ceph than in the actual data export.

Cheers,

Nico

--
Sustainable and modern Infrastructures by ungleich.ch
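
On the scrape-interval question, here is a minimal sketch of why a longer polling interval may simply miss a short-lived spike. Everything in it is an assumption made for illustration: the one-reading-per-second series, the 20-second spike, the baseline and spike values, and the helper names `sample` and `catch_fraction`. It does not use real Ceph data and does not model how Zabbix or Prometheus actually compute their values; it only treats the metric as a point-in-time reading.

```python
def sample(series, interval, offset=0):
    """Return the readings a poller sees when it reads every `interval` seconds."""
    return series[offset::interval]


def catch_fraction(series, interval, threshold):
    """Fraction of possible poller start offsets that see a value above `threshold`."""
    hits = sum(
        1
        for offset in range(interval)
        if max(sample(series, interval, offset)) > threshold
    )
    return hits / interval


# Hypothetical gauge: one reading per second for 5 minutes, flat at BASELINE,
# with a bogus 20-second spike starting at t=130 (all numbers are made up).
BASELINE, SPIKE = 100, 10_000_000
series = [BASELINE] * 300
for t in range(130, 150):
    series[t] = SPIKE

fast = sample(series, 15)  # scrape every 15 s, starting at t=0
slow = sample(series, 60)  # poll every 60 s, starting at t=0

print("15 s scrape max:", max(fast))
print("60 s poll max:  ", max(slow))
print("15 s catch fraction:", catch_fraction(series, 15, BASELINE))
print("60 s catch fraction:", catch_fraction(series, 60, BASELINE))
```

In this toy setup the 15-second scraper always lands inside the spike, because the spike is longer than the scrape interval, while the 60-second poller only sees it for some start offsets. That would be consistent with the spike being absent from longer-interval stats, but it does not rule out that the other tool derives its values differently.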