Hi Andreas,

That's good to know. I managed to fix the problem! Here is my journey in case it helps anyone:

My system drives are only 512GB, so I added a spare 1TB drive to each server and moved the mon db to the new drive. I set noout, nobackfill and norecover and enabled only the ceph mon and osd services (I disabled mgr and mds in case they were the ones throwing the log messages). I then let it sit.

In the first hour the db expanded:

mon.a: 1GB -> 80GB
mon.b: 500GB -> 550GB
mon.c: 500GB -> 500GB

Then, after another hour, mon.a increased to 100GB but mon.c dropped to 50GB. After another hour, mon.a and mon.c were down to ~10GB. By the next morning the final mon was also at ~10GB and the cluster was happy again. Thank you ceph!

It would be great to know what caused the initial inflation, but my takeaway is to keep the mon db on a drive separate from the OS in case the db overinflates (and the 10GB minimum hardware requirement should have an asterisk if this is a common issue). I think part of my problem was that the inflation started interfering with OS functions, which made things worse.

Thanks all for your help. It definitely helped me sort things out.

Best,
Ricardo

-----Original Message-----
From: Andreas John <aj@xxxxxxxxxxx>
Sent: Thursday, March 11, 2021 2:32 AM
To: ceph-users@xxxxxxx
Subject: Re: mon db growing. over 500Gb

Hello,

I also observed an excessively growing mon DB during recovery. Luckily, we were able to solve it by extending the mon db disk.

I have not had the chance to re-check this, but the options nobackfill and norecover might cause that behavior. It feels like the mon holds data that cannot be flushed to an OSD.

rgds,
j.

On 11.03.21 10:47, Marc wrote:
> From what I have read here in the past, a growing monitor db is related
> to not having PGs in 'clean active' state
>
>> -----Original Message-----
>> From: ricardo.re.azevedo@xxxxxxxxx <ricardo.re.azevedo@xxxxxxxxx>
>> Sent: 11 March 2021 00:59
>> To: ceph-users@xxxxxxx
>> Subject: mon db growing. over 500Gb
>>
>> Hi all,
>>
>> I have a fairly pressing issue. I had a monitor fall out of quorum
>> because it ran out of disk space during rebalancing after switching to
>> upmap. I noticed that every monitor's store.db had started taking up
>> nearly all the disk space, so I set noout, nobackfill and norecover and
>> shut down all the monitor daemons.
>> Each store.db was at:
>>
>> mon.a 89GB (the one that first dropped out)
>> mon.b 400GB
>> mon.c 400GB
>>
>> I tried setting mon_compact_on_start. This brought mon.a down to 1GB.
>> Cool.
>> However, when I tried it on the other monitors, the db size grew by
>> ~1GB/10s, so I shut them down again.
>>
>> Any idea what is going on? Or how can I shrink the db back down?
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Andreas John
net-lab GmbH | Frankfurter Str. 99 | 63067 Offenbach
Geschaeftsfuehrer: Andreas John | AG Offenbach, HRB40832
Tel: +49 69 8570033-1 | Fax: -2 | http://www.net-lab.net
Facebook: https://www.facebook.com/netlabdotnet
Twitter: https://twitter.com/netlabdotnet
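
Since the thread revolves around each monitor's store.db filling the disk before anyone noticed, here is a minimal Python sketch of how the on-disk size could be checked per monitor. It is not taken from the thread. The function names `dir_size_bytes` and `oversized_stores`, the `/var/lib/ceph/mon/ceph-<id>/store.db` layout, and the 15 GiB threshold are all assumptions made for illustration; the actual data path depends on how the cluster was deployed.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all files found under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Files can disappear mid-walk while the store compacts; skip them.
            try:
                total += os.path.getsize(full)
            except OSError:
                continue
    return total


def oversized_stores(store_paths, limit_bytes):
    """Return {mon_name: size_bytes} for every store larger than limit_bytes."""
    result = {}
    for name, path in store_paths.items():
        size = dir_size_bytes(path)
        if size > limit_bytes:
            result[name] = size
    return result


if __name__ == "__main__":
    # Assumed layout: one store.db per monitor (hypothetical paths).
    stores = {m: f"/var/lib/ceph/mon/ceph-{m}/store.db" for m in ("a", "b", "c")}
    limit = 15 * 1024**3  # illustrative 15 GiB threshold, not from the thread
    for mon, size in oversized_stores(stores, limit).items():
        print(f"mon.{mon}: store.db is {size / 1024**3:.1f} GiB")
```

Run periodically on each monitor host (for example from cron), something like this could flag a growing store before it starts competing with the OS for disk space. A path that does not exist is simply reported as size 0.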