On 01/20/2016 04:25 PM, Joao Eduardo Luis wrote:
> On 01/20/2016 03:15 PM, Wido den Hollander wrote:
>> Hello,
>>
>> I have an issue with a (not in production!) Ceph cluster which I'm
>> trying to resolve.
>>
>> On Friday the network links between the racks failed and this caused all
>> monitors to lose connection.
>>
>> Their leveldb stores kept growing and they are currently 100% full. They
>> all have a few hundred MB left.
>
> I'm incredibly curious to know what was written to leveldb to bring it
> to grow unbounded. Did the monitors hold quorum? I'm guessing that would
> be a 'no', given the network failure you mentioned, hence my morbid
> curiosity in figuring out what happened there.
>

Yes, quorum got lost. Monitors are in different racks and the core
switching failed. Since it was pre-production, people didn't notice
until Tuesday.

> If you don't mind, running a 'ceph-kvstore-tool /path/to/store.db
> leveldb list > /tmp/store.dump' could, maybe, shed some light on this
> issue (at least it will dump all the keys, and maybe something will be
> obvious, don't know). I'd certainly be interested in taking a look at
> those stores if you don't mind ;)
>

This is an 1800 OSD cluster and a 'ceph-kvstore-tool <path> list' shows
me a lot, and I mean a lot, of osdmaps.

I think that stuff failed horribly due to the network flapping.

Running just the list already compacted leveldb, btw. I have free space
again and the monitors are starting. Waiting for them to form a quorum
again.

>> Starting the 'compact on start' doesn't work since the FS is 100%
>> full.
>>
>> error: monitor data filesystem reached concerning levels of
>> available storage space (available: 0% 238 MB)
>> you may adjust 'mon data avail crit' to a lower value to make this go
>> away (default: 0%)
>>
>> One of the 5 monitors is now running but that's not enough.
>>
>> Any ideas how to compact this leveldb? I can't free up any more space
>> right now on these systems.
>> Getting bigger disks in is also going to take a lot of time.
>
> Running 'ceph-kvstore-tool' may also force leveldb to compact on open,
> so you may have a shot there at compaction. If that doesn't work,
> 'ceph-monstore-tool' has a 'compact' command -- that should help you
> sort it out.
>
> -Joao
>

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com