Re: mon db growing. over 500Gb

<ricardo.re.azevedo@xxxxxxxxx> · Thu, 11 Mar 2021 11:40:51 -0800

Hi Andreas,

That's good to know. I managed to fix the problem! Here is my journey in
case it helps anyone:

My system drives are only 512GB, so I added spare 1TB drives to each server
and moved the mon db to the new drive. I set noout, nobackfill and norecover
and enabled only the ceph mon and osd services (I disabled mgr and mds in case
they were throwing the log messages). I then let it sit. In the first hour
the db expanded:

mon.a:   1GB -> 80GB
mon.b: 500GB -> 550GB
mon.c: 500GB -> 500GB

Then, after another hour, mon.a increased to 100GB but mon.c dropped to 50GB.
After another hour mon.a and mon.c were down to ~10GB. By the next morning
the final mon (mon.b) was also ~10GB and the cluster was happy again. Thank you
ceph!
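
For anyone retracing these steps, the flag handling could be sketched in
Python roughly as follows. The `ceph osd set` / `ceph osd unset` commands and
the three flag names are the ones mentioned above; the helper names
`flag_commands` and `apply_flags` are made up for illustration, and the sketch
assumes the `ceph` CLI is on the PATH with a working admin keyring.

```python
import subprocess

# Flags used in this thread to pause data movement while the mons settle.
FLAGS = ("noout", "nobackfill", "norecover")


def flag_commands(action, flags=FLAGS):
    """Build the `ceph osd set|unset <flag>` command lines without running them."""
    if action not in ("set", "unset"):
        raise ValueError("action must be 'set' or 'unset'")
    return [["ceph", "osd", action, flag] for flag in flags]


def apply_flags(action, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in flag_commands(action):
        print(" ".join(cmd))
        if not dry_run:
            # Assumes the ceph CLI is installed and can reach the cluster.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply_flags("set")  # dry run: only prints the commands
```

Clearing the flags once the cluster has settled would be the same call with
"unset" and dry_run=False.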

It would be great to know what caused the initial inflation, but my takeaway
is to keep the mon db on a drive separate from the OS in case the db
overinflates (and the 10GB minimum hardware requirement should have an
asterisk if this is a common issue). I think part of my problem was that the
inflation started interfering with OS functions, exacerbating things.
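
A small watcher along these lines might flag the inflation before it starves
the OS. This is only a sketch: the path /var/lib/ceph/mon/ceph-a/store.db is
the usual default for a non-containerized mon named "a" and is an assumption
here, as are the function names and the 50% warning threshold.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.path.getsize(full)
            except OSError:
                # Files can disappear mid-walk (e.g. during compaction).
                pass
    return total


def store_report(path, warn_fraction=0.5):
    """Return (db_bytes, fs_total_bytes, warn) for the store at path."""
    db_bytes = dir_size_bytes(path)
    fs_total = shutil.disk_usage(path).total
    return db_bytes, fs_total, db_bytes > warn_fraction * fs_total


if __name__ == "__main__":
    # Assumed default location for mon.a; adjust for your deployment.
    store = "/var/lib/ceph/mon/ceph-a/store.db"
    if os.path.isdir(store):
        size, total, warn = store_report(store)
        print(f"{store}: {size / 1e9:.1f} GB of {total / 1e9:.1f} GB"
              f"{' (WARNING)' if warn else ''}")
```

Run from cron or a loop, this would have shown the growth from 1GB to 80GB
within the first hour.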

Thanks all for your help. It definitely helped me sort things out.

Best,
Ricardo

-----Original Message-----
From: Andreas John <aj@xxxxxxxxxxx> 
Sent: Thursday, March 11, 2021 2:32 AM
To: ceph-users@xxxxxxx
Subject:  Re: mon db growing. over 500Gb

Hello,

I have also observed an excessively growing mon DB during recovery. Luckily we
were able to solve it by extending the mon db disk.

Without having had the chance to re-check: the options nobackfill and norecover
might cause that behavior. It feels like the mon holds data that cannot be
flushed to an OSD.

rgds,

j.

On 11.03.21 10:47, Marc wrote:
> From what I have read here in the past, a growing monitor db is related 
> to not having PGs in the 'active+clean' state
>
>
>> -----Original Message-----
>> From: ricardo.re.azevedo@xxxxxxxxx <ricardo.re.azevedo@xxxxxxxxx>
>> Sent: 11 March 2021 00:59
>> To: ceph-users@xxxxxxx
>> Subject:  mon db growing. over 500Gb
>>
>> Hi all,
>>
>>
>>
>> I have a fairly pressing issue. I had a monitor fall out of quorum 
>> because it ran out of disk space during rebalancing from switching to 
>> upmap. I noticed that all my monitors' store.db directories had started 
>> taking up nearly all the disk space, so I set noout, nobackfill and 
>> norecover and shut down all the monitor daemons.
>> Each store.db was at:
>>
>>
>>
>> mon.a 89GB (the one that first dropped out)
>>
>> mon.b 400GB
>>
>> mon.c 400GB
>>
>>
>> I tried setting mon_compact_on_start. This brought mon.a down to 1GB.
>> Cool.
>> However, when I tried it on the other monitors, the db size increased by 
>> ~1GB every 10s, so I shut them down again.
>>
>> Any idea what is going on? Or how can I shrink the db back down?
>>
>>
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an 
>> email to ceph-users-leave@xxxxxxx
>
--
Andreas John
net-lab GmbH  |  Frankfurter Str. 99  |  63067 Offenbach
Geschaeftsfuehrer: Andreas John | AG Offenbach, HRB40832
Tel: +49 69 8570033-1 | Fax: -2 | http://www.net-lab.net

Facebook: https://www.facebook.com/netlabdotnet
Twitter: https://twitter.com/netlabdotnet