On Wed, 12 Jun 2019, Simon Leinen wrote:
> Dear Sage,
>
> > Also, can you try ceph-bluestore-tool bluefs-export on this osd? I'm
> > pretty sure it'll crash in the same spot, but just want to confirm
> > it's a bluefs issue.
>
> To my surprise, this actually seems to have worked:
>
> $ time sudo ceph-bluestore-tool --out-dir /mnt/ceph bluefs-export --path /var/lib/ceph/osd/ceph-49
> inferring bluefs devices from bluestore path
>  slot 2 /var/lib/ceph/osd/ceph-49/block -> /dev/dm-9
>  slot 1 /var/lib/ceph/osd/ceph-49/block.db -> /dev/dm-8
> db/
> db/072900.sst
> db/072901.sst
> db/076487.sst
> db/076488.sst
> db/076489.sst
> db/076490.sst
> [...]
> db/079726.sst
> db/079727.log
> db/CURRENT
> db/IDENTITY
> db/LOCK
> db/MANIFEST-053662
> db/OPTIONS-053662
> db/OPTIONS-053665
> db.slow/
> db.slow/049192.sst
> db.slow/049193.sst
> db.slow/049831.sst
> db.slow/057443.sst
> db.slow/057444.sst
> db.slow/058254.sst
> [...]
> db.slow/079718.sst
> db.slow/079719.sst
> db.slow/079720.sst
> db.slow/079721.sst
> db.slow/079722.sst
> db.slow/079723.sst
> db.slow/079724.sst
>
> real    5m19.953s
> user    0m0.101s
> sys     1m5.571s
> leinen@unil0047:/var/lib/ceph/osd/ceph-49$
>
> It left 3GB in /mnt/ceph/db (55 files of varying sizes),
>
> and 39GB in /mnt/ceph/db.slow (620 files of mostly 68MB each).

What happens if you do

 ceph-kvstore-tool rocksdb /mnt/ceph/db stats

or, if that works,

 ceph-kvstore-tool rocksdb /mnt/ceph/db compact

It looks like bluefs is happy (in that it can read the whole set of
rocksdb files), so the question is whether rocksdb can open them, or if
there's some corruption or problem at the rocksdb level.

The original crash is actually here:

...
 9: (tc_new()+0x283) [0x7fbdbed8e943]
 10: (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long)+0x69) [0x5600b1268109]
 11: (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long)+0x63) [0x5600b12f5b43]
 12: (rocksdb::BlockBuilder::Add(rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Slice const*)+0x10b) [0x5600b1eaca9b]
...

where tc_new is (I think) tcmalloc. That looks to me like rocksdb is
probably trying to allocate something very big. The question is whether
that will happen with the exported files, or only on bluefs...

Thanks!
sage

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
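
For reference, a minimal sketch of how the exported copy could be opened with
ceph-kvstore-tool. It assumes (this is not confirmed anywhere in this thread)
that rocksdb needs to find the db.slow SSTs alongside the rest of the database,
since the MANIFEST references them by file number, and that symlinks are enough
for that; hard links or copies may be required instead:

 $ cd /mnt/ceph/db
 $ ln -s ../db.slow/*.sst .                      # assumption: expose the slow-tier SSTs next to the MANIFEST
 $ ceph-kvstore-tool rocksdb /mnt/ceph/db stats
 $ ceph-kvstore-tool rocksdb /mnt/ceph/db compact   # only if stats succeeds, as suggested above

If the exported copy opens cleanly but the compaction reproduces the huge
allocation, that would point at the rocksdb data itself rather than at bluefs.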