what are these files for mon?

On 09/16/2014 04:35 PM, Gregory Farnum wrote:
> I don't really know; Joao has handled all these cases. I *think* they've
> been tied to a few bad versions of LevelDB, but I'm not certain. (There
> were a number of discussions about it on the public mailing lists.)
> -Greg
>
> On Tuesday, September 16, 2014, Florian Haas <florian at hastexo.com
> <mailto:florian at hastexo.com>> wrote:
>
>     Hi Greg,
>
>     just picked up this one from the archive while researching a different
>     issue and thought I'd follow up.
>
>     On Tue, Aug 19, 2014 at 6:24 PM, Gregory Farnum <greg at inktank.com>
>     wrote:
>      > The sst files are files used by leveldb to store its data; you cannot
>      > remove them. Are you running on a very small VM? How much space are
>      > the files taking up in aggregate?
>      > Speaking generally, I think you should see something less than a GB
>      > worth of data there, but some versions of leveldb under some
>      > scenarios are known to misbehave and grow pretty large.
>
>     Can you elaborate on the scenarios where leveldb is misbehaving? I've
>     also seen reports of this before, with .sst files growing to several
>     GB in size. Is this a cause for concern (for example, would you expect
>     mons to slow down) and if so, how would you recover? Would you
>     essentially nuke the mon and replace it with another?

Forcing the monitor to compact on start and then restarting it is the
current workaround for overgrown ssts.  This happens on a regular basis
with some clusters, and I've not been able to track down the source.  It
seems that leveldb holds on to previous, useless data and will only
relinquish it upon being closed.
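
To judge whether a store is overgrown in the first place, it helps to
total up the .sst files.  A minimal sketch follows; the function name
sst_usage is made up for illustration, and the store path is whatever
your mon data directory is (nothing here is Ceph-specific):

```python
import os


def sst_usage(store_dir):
    """Return (file_count, total_bytes) for the .sst files under store_dir."""
    count = 0
    total = 0
    for root, _dirs, files in os.walk(store_dir):
        for name in files:
            if name.endswith(".sst"):
                count += 1
                total += os.path.getsize(os.path.join(root, name))
    return count, total
```

For example, sst_usage("/var/lib/ceph/mon/ceph-a/store.db") would report
on a mon named "a", assuming the default data path; substitute your own
path.  Divide the byte count by 1024**3 to compare it against the
"less than a GB" rule of thumb quoted above.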

When this happens, monitors do not slow down per se, but they tend to
misbehave: hanging at times, spurious elections, flapping quorum.  This
is mostly because, up until recently, the monitors would wait for updates
to be written to leveldb.  Leveldb would in turn misbehave as (afaict)
it's busy dealing with clutter.  Sage pushed patches to master to have
the monitor perform async writes to leveldb, so as to prevent the monitor
from hanging when leveldb hangs, and this should help quite a bit with
all the weirdness.
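
The general idea behind async writes can be sketched as below.  This is
only an illustration of the pattern, not the actual monitor code (which
is C++); the class AsyncWriter and its methods are hypothetical names,
and a plain dict stands in for the key/value store:

```python
import queue
import threading


class AsyncWriter:
    """Hand writes to a background thread so the caller never blocks on the store."""

    def __init__(self, store):
        self._store = store  # any dict-like object standing in for the store
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, key, value):
        """Queue a write and return immediately."""
        self._pending.put((key, value))

    def _drain(self):
        while True:
            item = self._pending.get()
            if item is None:  # sentinel queued by close()
                self._pending.task_done()
                break
            key, value = item
            self._store[key] = value  # the slow call runs off the caller's thread
            self._pending.task_done()

    def flush(self):
        """Block until every queued write has been applied."""
        self._pending.join()

    def close(self):
        """Apply remaining writes, then stop the worker thread."""
        self._pending.put(None)
        self._worker.join()
```

With this shape, a stall in the store delays the worker thread only; the
caller keeps running and waits explicitly (flush) when it needs the data
to be durable.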

So to recap: the current workaround is to add

    mon compact on start = true

to your ceph.conf and restart the monitor.

   -Joao


>
>     Cheers,
>     Florian
>
>
>
> --
> Software Engineer #42 @ http://inktank.com | http://ceph.com


-- 
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com

