On Wed, 12 Jun 2019, Sage Weil wrote:
> On Thu, 13 Jun 2019, Simon Leinen wrote:
> > Sage Weil writes:
> > >> 2019-06-12 23:40:43.555 7f724b27f0c0 1 rocksdb: do_open column families: [default]
> > >> Unrecognized command: stats
> > >> ceph-kvstore-tool: /build/ceph-14.2.1/src/rocksdb/db/version_set.cc:356: rocksdb::Version::~Version(): Assertion `path_id < cfd_->ioptions()->cf_paths.size()' failed.
> > >> *** Caught signal (Aborted) **
> >
> > > Ah, this looks promising.. it looks like it got it open and has some
> > > problem with the error/teardown path.
> >
> > > Try 'compact' instead of 'stats'?
> >
> > That ran for a while and then crashed, also in the destructor for
> > rocksdb::Version, but with an otherwise different backtrace. I'm
> > attaching the log again.
>
> Hmm, I'm pretty sure this is a shutdown problem, but not certain. If you
> do
>
>  ceph-kvstore-tool rocksdb /mnt/ceph/db list > keys
>
> is the keys file huge? Can you send the head and tail of it so we can
> make sure it looks complete?
>
> One last thing to check:
>
>  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-NNN list > keys
>
> and see if that behaves similarly or crashes in the way it did before when
> the OSD was starting.

One other thing to try before taking any drastic steps (as described
below):

 ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-NNN fsck

And, if we do the below,

> If the exported version looks intact, I have a workaround that will
> make the osd use that external rocksdb db instead of the embedded one...
> basically,
>
>  - symlink the db, db.wal, db.slow files from the osd dir
>    (/var/lib/ceph/osd/ceph-NNN/db -> ... etc)
>  - ceph-bluestore-tool --dev /var/lib/ceph/osd/ceph-NNN/block set-label-key -k bluefs -v 0
>  - start osd

...then before starting the OSD we should again do

 ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-NNN fsck

sage

>
> but be warned this is fragile: there isn't a bluefs import function, so
> this OSD will be permanently in that weird state. The goal will be to get
> it up and the PG/cluster behaving, and then eventually let rados recover
> elsewhere and reprovision this osd.
>
> But first, let's make sure the external rocksdb has a complete set of
> keys!
>
> sage
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
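
For the "is the keys file huge / send the head and tail" check quoted above, something like the following might save a round trip. It is only a rough sketch: the function name summarize_keys and its output format are made up for illustration, and it assumes nothing beyond a plain-text key listing with one key per line, as written by the list command.

```shell
#!/bin/sh
# summarize_keys FILE [N]
# Print the byte size, the line count, and the first and last N lines of a
# key listing (N defaults to 20). The function name and output layout are
# illustrative only; they are not part of any ceph tool.
summarize_keys() {
    file=$1
    n=${2:-20}

    # Refuse a missing or zero-length listing: that by itself suggests the
    # export did not complete.
    if [ ! -s "$file" ]; then
        echo "error: $file is missing or empty" >&2
        return 1
    fi

    echo "bytes: $(wc -c < "$file" | tr -d ' ')"
    echo "lines: $(wc -l < "$file" | tr -d ' ')"
    echo "--- head ---"
    head -n "$n" "$file"
    echo "--- tail ---"
    tail -n "$n" "$file"
}

# Example, using the command from earlier in the thread:
#   ceph-kvstore-tool rocksdb /mnt/ceph/db list > keys
#   summarize_keys keys 20
```

The summary says nothing about whether the key set is complete; it only collects the pieces asked for (size, head, tail) in one paste-able block.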