Re: rocksdb mon stores growing until restart

On Wed, Sep 19, 2018 at 7:01 PM Bryan Stillwell <bstillwell@xxxxxxxxxxx> wrote:
>
> > On 08/30/2018 11:00 AM, Joao Eduardo Luis wrote:
> > > On 08/30/2018 09:28 AM, Dan van der Ster wrote:
> > > Hi,
> > > Is anyone else seeing rocksdb mon stores slowly growing to >15GB,
> > > eventually triggering the 'mon is using a lot of disk space' warning?
> > > Since upgrading to luminous, we've seen this happen at least twice.
> > > Each time, we restart all the mons and then stores slowly trim down to
> > > <500MB. We have 'mon compact on start = true', but it's not the
> > > compaction that's shrinking the rocksdb stores -- the space used seems to
> > > decrease over a few minutes only after *all* mons have been restarted.
> > > This reminds me of a hammer-era issue where references to trimmed maps
> > > were leaking -- I can't find that bug at the moment, though.
> >
> > Next time this happens, mind listing the store contents and checking whether
> > you are holding way too many osdmaps? You shouldn't be holding more osdmaps
> > than the default IF the cluster is healthy and all the pgs are clean.
> >
> > I chased a bug pertaining to this last year, even got a patch, but then
> > was unable to reproduce it. Didn't pursue merging the patch any longer
> > (I think I may still have an open PR for it though), simply because it
> > was no longer clear if it was needed.
>
> I just had this happen to me while using ceph-gentle-split on a 12.2.5
> cluster with 1,370 OSDs.  Unfortunately, I restarted the mon nodes which
> fixed the problem before finding this thread.  I'm only halfway done
> with the split, so I'll see if the problem resurfaces.
>

I think I've understood what's causing this -- it's related to the
issue we've seen where osdmaps are not being trimmed on the osds.
It seems that once oldest_map and newest_map are within 500 of each
other, the osdmaps are never trimmed again until the mons are restarted.
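
For anyone who wants to check whether their mons are holding too many
osdmaps, here is a rough Python sketch. It assumes the JSON printed by
`ceph report` carries the fields osdmap_first_committed and
osdmap_last_committed (that is how I remember it on luminous; please
verify on your version). The function names and the 500 limit are my own
illustrative choices, not anything from the Ceph code.

```python
import json
import subprocess

# Assumption: 500 is used only because it is the figure mentioned above;
# adjust it to whatever your cluster is expected to keep.
EXPECTED_MAX_OSDMAPS = 500


def osdmap_span(report):
    """Count the osdmap epochs held by the mons, given a parsed report dict."""
    first = int(report["osdmap_first_committed"])
    last = int(report["osdmap_last_committed"])
    return last - first + 1  # both ends are inclusive


def holding_too_many(report, limit=EXPECTED_MAX_OSDMAPS):
    """Return True when more osdmaps are held than the given limit."""
    return osdmap_span(report) > limit


def fetch_report():
    """Run `ceph report` and parse its stdout as JSON (needs admin access)."""
    result = subprocess.run(
        ["ceph", "report"], check=True, capture_output=True, text=True
    )
    return json.loads(result.stdout)
```

On a mon host, something like `holding_too_many(fetch_report())` would
give a quick yes/no; the two pure functions can also be fed a report
saved earlier with `ceph report > report.json`.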

I updated this tracker: http://tracker.ceph.com/issues/37875

-- dan