You should never run into this problem on a 4TB drive in the first
place. BlueStore explodes if it can't allocate a few GB, but on a 4TB
drive the default full_ratio of 0.95 will forbid placing any new
objects onto an OSD with less than 200GB free.
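If you want to inspect or lower those thresholds, it looks roughly
like this on Nautilus (the values below are just examples for a test
cluster, not recommendations):

  # show the current ratios stored in the OSD map
  ceph osd dump | grep ratio

  # lower the thresholds, e.g. for a test cluster with tiny disks
  ceph osd set-nearfull-ratio 0.66
  ceph osd set-backfillfull-ratio 0.75
  ceph osd set-full-ratio 0.85

  # the failsafe Paul mentions below is, as far as I can tell, the
  # osd_failsafe_full_ratio config option (default 0.97)
  ceph config set osd osd_failsafe_full_ratio 0.90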
On Thu, Oct 31, 2019 at 9:31 AM George Shuklin <george.shuklin@xxxxxxxxx> wrote:
>
> Thank you everyone, I got it. There is no way to fix an out-of-space
> BlueStore without expanding it.
>
> Therefore, in production we will stick with 99%FREE as the LV size,
> as it gives operators a 'last chance' to repair the cluster in case
> of emergency. It's a bit unfortunate that we have to give up a whole
> percent (1% is a lot on 4TB drives).
>
> On 31/10/2019 15:04, Nathan Fish wrote:
> > The best way to prevent this on a testing cluster with tiny virtual
> > drives is probably to lower the various full_ratios significantly.
> >
> > On Thu, Oct 31, 2019 at 7:17 AM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
> >> BlueStore doesn't handle running out of space gracefully because
> >> that doesn't happen on a real disk: full_ratio (95%) and the
> >> failsafe_full_ratio (97%? some obscure config option) kick in
> >> before that point.
> >>
> >> Yeah, I've also lost some test clusters with tiny disks to this.
> >> The usual reason is keeping them in a degraded state for weeks at
> >> a time...
> >>
> >> Paul
> >>
> >> --
> >> Paul Emmerich
> >>
> >> Looking for help with your Ceph cluster? Contact us at https://croit.io
> >>
> >> croit GmbH
> >> Freseniusstr. 31h
> >> 81247 München
> >> www.croit.io
> >> Tel: +49 89 1896585 90
> >>
> >> On Thu, Oct 31, 2019 at 10:50 AM George Shuklin
> >> <george.shuklin@xxxxxxxxx> wrote:
> >>> Hello.
> >>>
> >>> In my lab, a Nautilus cluster with BlueStore suddenly went dark.
> >>> As I found, it had used 98% of its space and most of the OSDs
> >>> (small, 10G each) went offline. Any attempt to restart them
> >>> failed with this message:
> >>>
> >>> # /usr/bin/ceph-osd -f --cluster ceph --id 18 --setuser ceph --setgroup ceph
> >>>
> >>> 2019-10-31 09:44:37.591 7f73d54b3f80 -1 osd.18 271 log_to_monitors
> >>> {default=true}
> >>> 2019-10-31 09:44:37.615 7f73bff99700 -1
> >>> bluestore(/var/lib/ceph/osd/ceph-18) _do_alloc_write failed to allocate
> >>> 0x10000 allocated 0x ffffffffffffffe4 min_alloc_size 0x10000 available 0x 0
> >>> 2019-10-31 09:44:37.615 7f73bff99700 -1
> >>> bluestore(/var/lib/ceph/osd/ceph-18) _do_write _do_alloc_write failed
> >>> with (28) No space left on device
> >>> 2019-10-31 09:44:37.615 7f73bff99700 -1
> >>> bluestore(/var/lib/ceph/osd/ceph-18) _txc_add_transaction error (28) No
> >>> space left on device not handled on operation 10 (op 30, counting from 0)
> >>> 2019-10-31 09:44:37.615 7f73bff99700 -1
> >>> bluestore(/var/lib/ceph/osd/ceph-18) ENOSPC from bluestore,
> >>> misconfigured cluster
> >>> /build/ceph-14.2.4/src/os/bluestore/BlueStore.cc: In function 'void
> >>> BlueStore::_txc_add_transaction(BlueStore::TransContext*,
> >>> ObjectStore::Transaction*)' thread 7f73bff99700 time 2019-10-31
> >>> 09:44:37.620694
> >>> /build/ceph-14.2.4/src/os/bluestore/BlueStore.cc: 11455:
> >>> ceph_abort_msg("unexpected error")
> >>>
> >>> I was able to recover the cluster by adding more space to the VGs
> >>> of some OSDs and running this command:
> >>>
> >>> ceph-bluestore-tool --log-level 30 --path /var/lib/ceph/osd/ceph-xx
> >>> --command bluefs-bdev-expand
> >>>
> >>> It worked, but only because I added space to the OSD.
> >>>
> >>> I'm curious: is there a way to recover such an OSD without growing
> >>> it? On the old filestore I could just remove some objects to gain
> >>> space; is this possible with bluestore? My main concern is that
> >>> the OSD daemon simply crashes at start, so I can't just add 'more
> >>> OSDs' to the cluster - all data becomes unavailable, because the
> >>> OSDs are completely dead.
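To spell out the recovery path you found, on an LVM-backed OSD it
looks roughly like this (the VG/LV names are placeholders for whatever
ceph-volume created on your nodes, and the sizes are just examples):

  # with the OSD stopped, grow its LV using free space in the VG
  lvextend -L +1G /dev/ceph-vg/osd-block-18

  # let BlueFS grow into the newly added space
  ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-18 --command bluefs-bdev-expand

  # bring the OSD back up
  systemctl start ceph-osd@18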
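Your 99%FREE idea works the same way: the unallocated 1% of the VG is
the emergency reserve you hand to the OSD later. Roughly, with
placeholder names again:

  # create the OSD LV at 99% of the VG, keeping ~1% unallocated
  lvcreate -l 99%FREE -n osd-block-0 ceph-vg

  # in an emergency, give the reserve to the LV, then run
  # bluefs-bdev-expand as above
  lvextend -l +100%FREE /dev/ceph-vg/osd-block-0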