Re: [ceph-users] ceph mon_data_size_warn limits for large cluster

On Thu, Feb 7, 2019 at 4:12 PM M Ranga Swami Reddy <swamireddy@xxxxxxxxx> wrote:
>
> >Compaction isn't necessary -- you should only need to restart all
> >peons, then the leader. A few minutes later the DBs should start
> >trimming.
>
> As this is a production cluster, it may not be safe to restart the
> ceph-mons, so we would prefer to compact the non-leader mons instead.
> Is this OK?
>

Compaction doesn't solve this particular problem, because the maps
have not yet been deleted by the ceph-mon process.

-- dan
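
For context, the restart sequence described above (peons first, leader last) could look roughly like this. This is only a sketch: it assumes a three-mon cluster managed by systemd, and the mon IDs (a/b/c) and hostnames (mon-a/mon-b/mon-c) are placeholders to adapt to your deployment.

```shell
# Identify the current leader; the other quorum members are peons.
ceph quorum_status --format json-pretty | grep quorum_leader_name

# Restart the peons first, one at a time, letting each rejoin quorum.
ssh mon-b 'systemctl restart ceph-mon@b'
ssh mon-c 'systemctl restart ceph-mon@c'

# Restart the leader last.
ssh mon-a 'systemctl restart ceph-mon@a'

# Trimming should start within a few minutes; watch the store size drop.
du -sh /var/lib/ceph/mon/ceph-*/store.db
```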


> Thanks
> Swami
>
> On Thu, Feb 7, 2019 at 6:30 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Feb 7, 2019 at 12:17 PM M Ranga Swami Reddy
> > <swamireddy@xxxxxxxxx> wrote:
> > >
> > > Hi Dan,
> > > >During backfilling scenarios, the mons keep old maps and grow quite
> > > >quickly. So if you have balancing, pg splitting, etc. ongoing for
> > > >awhile, the mon stores will eventually trigger that 15GB alarm.
> > > >But the intended behavior is that once the PGs are all active+clean,
> > > >the old maps should be trimmed and the disk space freed.
> > >
> > > The old maps are not trimmed even after the cluster reaches the
> > > active+clean state for all PGs. Is there a known bug here?
> > > As the DB size shows > 15GB, do I need to run the compact commands
> > > to do the trimming?
> >
> > Compaction isn't necessary -- you should only need to restart all
> > peons, then the leader. A few minutes later the DBs should start
> > trimming.
> >
> > -- dan
> >
> >
> > >
> > > Thanks
> > > Swami
> > >
> > > On Wed, Feb 6, 2019 at 6:24 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> > > >
> > > > Hi,
> > > >
> > > > With HEALTH_OK, a mon data dir should be under 2GB even for such a large cluster.
> > > >
> > > > During backfilling scenarios, the mons keep old maps and grow quite
> > > > quickly. So if you have balancing, pg splitting, etc. ongoing for
> > > > awhile, the mon stores will eventually trigger that 15GB alarm.
> > > > But the intended behavior is that once the PGs are all active+clean,
> > > > the old maps should be trimmed and the disk space freed.
> > > >
> > > > However, several people have noted that (at least in luminous
> > > > releases) the old maps are not trimmed until after HEALTH_OK *and* all
> > > > mons are restarted. This ticket seems related:
> > > > http://tracker.ceph.com/issues/37875
> > > >
> > > > (Over here we're restarting mons every ~2-3 weeks, resulting in the
> > > > mon stores dropping from >15GB to ~700MB each time).
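
To check whether a given cluster is hitting this, one can compare the health warning against the on-disk store size. A quick sketch, assuming the default mon data paths (paths and warning wording may vary by release and deployment):

```shell
# The warning shows up in health detail as a per-mon store size message.
ceph health detail | grep -i 'store is getting too big'

# Cross-check against the on-disk size of each mon's backing store.
du -sh /var/lib/ceph/mon/ceph-*/store.db
```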
> > > >
> > > > -- Dan
> > > >
> > > >
> > > > On Wed, Feb 6, 2019 at 1:26 PM Sage Weil <sage@xxxxxxxxxxxx> wrote:
> > > > >
> > > > > Hi Swami
> > > > >
> > > > > The limit is somewhat arbitrary, based on cluster sizes we had seen when
> > > > > we picked it.  In your case it should be perfectly safe to increase it.
> > > > >
> > > > > sage
> > > > >
> > > > >
> > > > > On Wed, 6 Feb 2019, M Ranga Swami Reddy wrote:
> > > > >
> > > > > > Hello - Are there any limits on mon_data_size for a cluster with 2PB
> > > > > > (with 2000+ OSDs)?
> > > > > >
> > > > > > Currently it is set to 15GB. What is the logic behind this? Can we
> > > > > > increase it when we get the mon_data_size_warn messages?
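
For reference, the threshold can be raised either at runtime or persistently in ceph.conf. A sketch, assuming the option takes a value in bytes (32212254720 = 30GB is an arbitrary example, not a recommendation):

```shell
# Runtime change (does not persist across mon restarts):
ceph tell mon.* injectargs '--mon_data_size_warn=32212254720'

# Persistent change, in ceph.conf on the mon hosts:
# [mon]
# mon data size warn = 32212254720
```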
> > > > > >
> > > > > > I am getting the mon_data_size_warn message even though there is
> > > > > > ample free space on the disk (around 300GB free).
> > > > > >
> > > > > > Earlier thread on the same discussion:
> > > > > > https://www.spinics.net/lists/ceph-users/msg42456.html
> > > > > >
> > > > > > Thanks
> > > > > > Swami
> > > > > >
> > > > > >
> > > > > >
> > > > > _______________________________________________
> > > > > ceph-users mailing list
> > > > > ceph-users@xxxxxxxxxxxxxx
> > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


