> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> Wido den Hollander
> Sent: 20 January 2016 15:27
> To: Zoltan Arnold Nagy <zoltan@xxxxxxxxxxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Ceph monitors 100% full filesystem, refusing start
>
> On 01/20/2016 04:22 PM, Zoltan Arnold Nagy wrote:
> > Hi Wido,
> >
> > So one out of the 5 monitors is running fine then? Did that one have
> > more space for its leveldb?
> >
>
> Yes. That one was at 99% full, and by cleaning some stuff in /var/cache
> and /var/log I was able to start it.
>
> It compacted the leveldb database and is now at 1% disk usage.
>
> Looking at the ceph_mon.cc code:
>
>   if (stats.avail_percent <= g_conf->mon_data_avail_crit) {
>
> Setting mon_data_avail_crit to 0 does not work, since 100% full is equal
> to 0% free and the <= check above still triggers.
>
> There is ~300 MB free on the other 4 monitors. I just can't start the mon
> and tell it to compact.
>
> Lessons learned here though: always make sure you have some additional
> space you can clear when you need it.

Slightly unrelated, but before the arrival of virtualisation, when I used to
manage MS Exchange servers we would always copy a DVD ISO onto the DB/logs
disk, so that in the event of a disk-full scenario we could instantly free
up 4 GB of space. Maybe something along those lines (dd /dev/zero to a file)
would be good practice; a rough sketch is at the end of this message.

>
> >> On 20 Jan 2016, at 16:15, Wido den Hollander <wido@xxxxxxxx> wrote:
> >>
> >> Hello,
> >>
> >> I have an issue with a (not in production!) Ceph cluster which I'm
> >> trying to resolve.
> >>
> >> On Friday the network links between the racks failed and this caused
> >> all monitors to lose connection.
> >>
> >> Their leveldb stores kept growing and they are currently 100% full.
> >> They all have only a few hundred MB left.
> >>
> >> Starting with 'compact on start' doesn't work since the FS is 100%
> >> full:
> >>
> >>   error: monitor data filesystem reached concerning levels of
> >>   available storage space (available: 0% 238 MB) you may adjust 'mon
> >>   data avail crit' to a lower value to make this go away (default: 0%)
> >>
> >> One of the 5 monitors is now running, but that's not enough.
> >>
> >> Any ideas how to compact this leveldb? I can't free up any more space
> >> right now on these systems. Getting bigger disks in is also going to
> >> take a lot of time.
> >>
> >> Any tools outside the monitors to use here?
> >>
> >> Keep in mind, this is a pre-production cluster. We would like to keep
> >> the cluster and fix this as a good exercise in stuff that could go
> >> wrong. Dangerous tools are allowed!
> >>
> >> --
> >> Wido den Hollander
> >> 42on B.V.
> >> Ceph trainer and consultant
> >>
> >> Phone: +31 (0)20 700 9902
> >> Skype: contact42on
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@xxxxxxxxxxxxxx
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>
> >
>
>
> --
> Wido den Hollander
> 42on B.V.
> Ceph trainer and consultant
>
> Phone: +31 (0)20 700 9902
> Skype: contact42on
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
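
[Editor's sketch of the ballast-file idea mentioned above. The path
/var/lib/ceph/mon/ballast.img and the 4 GB size are assumptions chosen for
illustration, not values taken from the thread; pick whatever fits your
monitor hosts.]

    # Pre-allocate ~4 GB of throwaway space on the monitor's data
    # filesystem while the disk is still healthy.
    dd if=/dev/zero of=/var/lib/ceph/mon/ballast.img bs=1M count=4096

    # In a disk-full emergency, delete the file to instantly reclaim the
    # space, so the monitor can start and compact its leveldb store.
    rm /var/lib/ceph/mon/ballast.img

On filesystems that support it, "fallocate -l 4G <file>" reserves the same
space much faster than dd, but the dd form works everywhere and matches the
suggestion in the message.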
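
[Editor's note: a rough sketch of the compaction knobs discussed in the
quoted messages, assuming option and command names as they existed around
this release era; the monitor id "mon01" is a placeholder, not from the
thread.]

    # Ask a *running* monitor to compact its leveldb store:
    MON_ID=mon01   # example id, use your own monitor's id
    ceph tell mon.$MON_ID compact

    # Or enable compaction at startup via ceph.conf before starting it:
    #   [mon]
    #       mon compact on start = true
    #
    # Note that lowering 'mon data avail crit' does not help on a 100% full
    # filesystem: 0% available is still <= the lowest value (0) you can set,
    # which is exactly the problem described above.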