I can see this in the logs:

2018-01-25 06:05:56.292124 7f37fa6ea700 -1 log_channel(cluster) log [ERR] : full status failsafe engaged, dropping updates, now 101% full
2018-01-25 06:05:56.325404 7f3803f9c700 -1 bluestore(/var/lib/ceph/osd/ceph-9) _do_alloc_write failed to reserve 0x4000
2018-01-25 06:05:56.325434 7f3803f9c700 -1 bluestore(/var/lib/ceph/osd/ceph-9) _do_write _do_alloc_write failed with (28) No space left on device
2018-01-25 06:05:56.325462 7f3803f9c700 -1 bluestore(/var/lib/ceph/osd/ceph-9) _txc_add_transaction error (28) No space left on device not handled on operation 10 (op 0, counting from 0)

Are they out of space, or is something mis-reporting?

Nick

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of David Turner

http://tracker.ceph.com/issues/22796

I was curious whether anyone here has ideas or experience with this problem. I created the tracker for it yesterday, after waking up to find all 3 of my SSD OSDs down and unable to start due to this segfault. These OSDs are in my small home cluster and hold the cephfs_cache and cephfs_metadata pools.

To recap: I upgraded from 10.2.10 to 12.2.2, successfully swapped out my 9 OSDs to Bluestore, reconfigured my crush rules to use OSD device classes, failed to remove the CephFS cache tier due to http://tracker.ceph.com/issues/22754, created these 3 SSD OSDs, and updated the cephfs_cache and cephfs_metadata pools to use the replicated_ssd crush rule... fast forward 2 days of this working great to me waking up with all 3 of them crashed and unable to start.

An OSD log with debug bluestore = 5 is attached to the tracker linked at the top of this email. My CephFS is completely down while these 2 pools are inaccessible. The OSDs themselves are intact if I need to move the data off to the HDDs manually or something. Any help is appreciated.
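For anyone trying to answer the "out of space or mis-reporting" question, a quick sanity check with the standard Ceph CLI looks roughly like this (osd.9 is taken from the log path above; the grep pattern is just a suggestion):

    # Per-OSD utilisation as Ceph accounts it, including bluestore overhead
    ceph osd df tree

    # Cluster-wide and per-pool usage
    ceph df detail

    # Full/failsafe thresholds the OSD is actually running with
    ceph daemon osd.9 config show | grep -E 'full_ratio|failsafe'

The failsafe message fires when usage crosses osd_failsafe_full_ratio (0.97 by default, if I remember right), so "now 101% full" points at the device genuinely running out of raw space rather than a pure accounting glitch, though the per-OSD numbers above would confirm it.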
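For context on the replicated_ssd rule mentioned in the recap, a device-class based rule in Luminous is created along these lines (rule, root, and pool names follow the message; whether these exact commands were used isn't stated):

    # Replicated rule restricted to OSDs with the 'ssd' device class
    ceph osd crush rule create-replicated replicated_ssd default host ssd

    # Point the two pools at the new rule
    ceph osd pool set cephfs_metadata crush_rule replicated_ssd
    ceph osd pool set cephfs_cache crush_rule replicated_ssd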