On 01/31/2017 07:12 PM, Shinobu Kinjo wrote:
On Wed, Feb 1, 2017 at 1:51 AM, Joao Eduardo Luis <joao@xxxxxxx> wrote:
On 01/31/2017 03:35 PM, David Turner wrote:
If you do have a large enough drive on all of your mons (and always
intend to do so) you can increase the mon store warning threshold in the
config file so that it no longer warns at 15360 MB.
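For illustration, a minimal sketch of what that could look like in ceph.conf, assuming the `mon_data_size_warn` option (the value is in bytes; 15360 MB, i.e. 15 GiB, is the default):

    [mon]
    # Raise the mon store size warning threshold from the
    # default 15 GiB (16106127360 bytes) to 30 GiB.
    mon data size warn = 32212254720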
And if you so decide to go that route, please be aware that the monitors are
known to misbehave if their store grows too much.
Would you please elaborate on what *misbehave* means? Do you have any
pointers that describe this more specifically?
In particular, when using leveldb, stalls while reading or writing to
the store - typically, leveldb is compacting when this happens. This
leads to all sorts of timeouts being triggered, but the really annoying
one is the lease timeout, which tends to result in a flapping quorum.
Also, being unable to sync monitors: again, stalls on leveldb lead to
timeouts being triggered and cause the sync to restart.
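A mitigation sometimes applied here (not suggested in this thread, just a sketch using the stock `mon_compact_on_start` option) is to have the monitor compact its store every time the daemon starts:

    [mon]
    # Compact the monitor's leveldb store on daemon startup,
    # reclaiming space before the mon rejoins the quorum.
    mon compact on start = true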
Once upon a time, this *may* have also translated into large memory
consumption. A direct relation was never proven, though, and the
behaviour went away as Ceph became smarter and distros updated their libs.
-Joao
Those warnings were put in place to let the admin know that action may
be needed, hopefully in time to avoid aberrant behaviour.
-Joao
From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Wido
den Hollander [wido@xxxxxxxx]
Sent: Tuesday, January 31, 2017 2:35 AM
To: Martin Palma; CEPH list
Subject: Re: mon.mon01 store is getting too big! 18119 MB >= 15360 MB -- 94% avail
On 31 January 2017 at 10:22, Martin Palma <martin@xxxxxxxx> wrote:
Hi all,
our cluster is currently performing a big expansion and is in recovery
mode (we doubled in size and OSD count, from 600 TB to 1.2 PB).
Yes, that is to be expected. When not all PGs are active+clean, the MONs
will not trim their datastore.
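To see whether you are there yet, the standard status commands show recovery progress and per-state PG counts:

    # Overall cluster health, including recovery/backfill progress
    ceph -s

    # Compact per-state PG summary; trimming resumes once
    # everything reports active+clean
    ceph pg stat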
Now we get the following message from our monitor nodes:
mon.mon01 store is getting too big! 18119 MB >= 15360 MB -- 94% avail
Reading [0], it says that this is normal during an active data
rebalance and that the store will be compacted after it finishes.
Should we wait until the recovery is finished or should we perform
"ceph tell mon.{id} compact" now during recovery?
Mainly wait, and make sure there is enough disk space. You can try a
compact, but that can take the mon offline temporarily.
Just make sure you have enough disk space :)
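For reference, a sketch of both steps, assuming the default mon store location (adjust the mon id and path to your deployment):

    # Check how much space the mon store currently uses
    du -sh /var/lib/ceph/mon/ceph-mon01/store.db

    # Trigger a manual compaction; the mon may be unresponsive
    # while it runs
    ceph tell mon.mon01 compact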
Wido
Best,
Martin
[0] https://access.redhat.com/solutions/1982273
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com