On 09/16/2014 04:35 PM, Gregory Farnum wrote:
> I don't really know; Joao has handled all these cases. I *think* they've
> been tied to a few bad versions of LevelDB, but I'm not certain. (There
> were a number of discussions about it on the public mailing lists.)
> -Greg
>
> On Tuesday, September 16, 2014, Florian Haas <florian at hastexo.com>
> wrote:
>
> > Hi Greg,
> >
> > just picked up this one from the archive while researching a different
> > issue and thought I'd follow up.
> >
> > On Tue, Aug 19, 2014 at 6:24 PM, Gregory Farnum <greg at inktank.com>
> > wrote:
> > > The sst files are files used by leveldb to store its data; you cannot
> > > remove them. Are you running on a very small VM? How much space are
> > > the files taking up in aggregate?
> > > Speaking generally, I think you should see something less than a GB
> > > worth of data there, but some versions of leveldb under some scenarios
> > > are known to misbehave and grow pretty large.
> >
> > Can you elaborate on the scenarios where leveldb is misbehaving? I've
> > also seen reports of this before, with .sst files growing to several
> > GB in size. Is this a cause for concern (for example, would you expect
> > mons to slow down) and if so, how would you recover? Would you
> > essentially nuke the mon and replace it with another?

Forcing the monitor to compact on start and restarting the mon is the
current workaround for overgrown ssts. This happens on a regular basis
with some clusters and I've not been able to track down the source. It
seems that leveldb keeps hold of previous, useless data and will only
relinquish it upon being closed.

When this happens monitors do not slow down per se but they tend to
misbehave: hanging at times, spurious elections, flapping quorum. This
is mostly because, up until recently, the monitors would wait on updates
to be written to leveldb. Leveldb would in turn misbehave as (afaict)
it's busy dealing with clutter.

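To check whether a monitor's store has grown this way, the aggregate
size of the .sst files can be summed with a few lines of Python. This is
only a minimal sketch: the store path (default layout with a
hypothetical mon id "a") and the 1 GiB threshold (borrowed from Greg's
"less than a GB" rule of thumb quoted above) are assumptions, not values
Ceph defines.

```python
import os

# Assumed location of the mon's leveldb store; adjust cluster name and mon id.
DEFAULT_STORE = "/var/lib/ceph/mon/ceph-a/store.db"
# Rough threshold echoing the "less than a GB" rule of thumb in this thread.
THRESHOLD_BYTES = 1 << 30


def sst_usage(store_dir):
    """Return (file_count, total_bytes) for *.sst files directly in store_dir."""
    count = 0
    total = 0
    for entry in os.scandir(store_dir):
        if entry.is_file() and entry.name.endswith(".sst"):
            count += 1
            total += entry.stat().st_size
    return count, total


def is_overgrown(total_bytes, threshold=THRESHOLD_BYTES):
    """True when the summed .sst size exceeds the (assumed) threshold."""
    return total_bytes > threshold
```

Calling sst_usage(DEFAULT_STORE) on the monitor host and passing the
byte total to is_overgrown() gives a quick yes/no on whether the
compaction workaround is worth trying.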
Sage pushed patches to master to have the monitor perform async writes
to leveldb, so as to prevent the monitor from hanging when leveldb
hangs, and this should help quite a bit with all the weirdness.

So to recap: the current workaround is to add 'mon compact on start =
true' to your ceph.conf and restart the monitor.

  -Joao

> > Cheers,
> > Florian
>
> --
> Software Engineer #42 @ http://inktank.com | http://ceph.com

--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
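
For reference, the workaround from the recap written out as a ceph.conf
fragment. Placing the option under the [mon] section is an assumption;
the thread only says to add it to ceph.conf.

```ini
[mon]
    # Compact the monitor's leveldb store every time the daemon starts.
    # Section placement is assumed; the option itself is from the thread.
    mon compact on start = true
```

The setting only takes effect when the monitor starts, so each monitor
has to be restarted for its store to be compacted.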