Thanks for the input, Lincoln. I think I am in a similar boat.

I don't have the insights module activated. I checked one of my troublesome monitors with the command you gave, and its store is indeed full of logm messages. I am not sure what would have caused that, though; my OSDs have been behaving relatively OK. I tried rebooting all my OSDs with the MDS and mgr disabled, and I am in the same spot.

Starting the mon manually with mon_compact_on_start gives:

    ceph@a /m/c/m/c/store.db> /usr/bin/ceph-mon -f --cluster ceph --id a --setuser ceph --setgroup ceph
    ignoring --setuser ceph since I am not root
    ignoring --setgroup ceph since I am not root
    2021-03-10T16:47:10.017-0800 7f076f143540 -1 compacting monitor store ...

It hangs on compaction while store.db keeps expanding. It seems like something is wrong with compaction itself, since I don't think the mon is connected yet at this point and I have every other Ceph service disabled. I had to increase ulimit at this point.

Any thoughts on how to proceed? Is there a way I can clear the db of these messages?

Thanks everyone

From: Lincoln Bryant <lincolnb@xxxxxxxxxxxx>
Sent: Wednesday, March 10, 2021 4:06 PM
To: ricardo.re.azevedo@xxxxxxxxx; ceph-users@xxxxxxx
Subject: Re: mon db growing. over 500Gb

Hi Ricardo,

I just had a similar issue recently. I did a dump of the monitor store (i.e., something like "ceph-monstore-tool /var/lib/ceph/mon/mon-a/ dump-keys") and most messages were of type 'logm'. For me I think it was a lot of log messages coming from an oddly behaving OSD. I've seen folks advise disabling the Ceph mgr insights module if you have it running and there are degraded PGs, to see if that helps. What finally solved it for me was doing a rolling restart of my nodes, but I started from all PGs active+clean.

--Lincoln

_____

From: ricardo.re.azevedo@xxxxxxxxx
Sent: Wednesday, March 10, 2021 5:59 PM
To: ceph-users@xxxxxxx
Subject: mon db growing. over 500Gb

Hi all,

I have a fairly pressing issue. I had a monitor fall out of quorum because it ran out of disk space during rebalancing from switching to upmap. I noticed that the store.db on all my monitors had started taking up nearly all the disk space, so I set noout, nobackfill and norecover and shut down all the monitor daemons. Each store.db was at:

mon.a 89GB (the one that first dropped out)
mon.a 400GB
mon.c 400GB

I tried setting mon_compact_on_start. This brought mon.a down to 1GB. Cool. However, when I tried it on the other monitors, the db size increased by ~1GB every 10s, so I shut them down again.

Any idea what is going on? Or how can I shrink the db back down?

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
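
For anyone wanting to repeat the check Lincoln describes, here is a minimal sketch that tallies dump-keys output by prefix, to see whether logm really dominates the store. It assumes each output line starts with the key prefix (such as logm) as its first whitespace-separated field. That output format, the function name count_prefixes, and the script name used below are assumptions for illustration, not something confirmed in this thread.

```python
import sys
from collections import Counter


def count_prefixes(lines):
    """Tally lines by their first whitespace-separated field.

    Assumption: that first field is the key prefix (e.g. "logm").
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


if __name__ == "__main__" and not sys.stdin.isatty():
    # Print prefixes from most to least frequent.
    for prefix, n in count_prefixes(sys.stdin).most_common():
        print(f"{n:>12}  {prefix}")
```

It could be fed from the tool with something like "ceph-monstore-tool /var/lib/ceph/mon/mon-a/ dump-keys | python3 count_prefixes.py", with the store path adjusted to the monitor in question.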