Is the StupidAllocator supported in Luminous?

I am seeing OOM issues on some of the OSD nodes where I am testing BlueStore on 12.2.0, so I decided to try the StupidAllocator to see whether it has a smaller memory footprint. I set the following in my ceph.conf:

bluefs_allocator = stupid
bluestore_cache_size_hdd = 1073741824
bluestore_cache_size_ssd = 1073741824
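
For reference, the two cache sizes above are given in bytes, and 1073741824 is exactly 1 GiB. A minimal Python sketch of the conversion follows; the helper name gib_to_bytes is hypothetical and not part of Ceph:

```python
def gib_to_bytes(gib: int) -> int:
    """Convert a size in GiB to bytes (1 GiB = 1024**3 bytes)."""
    return gib * 1024 ** 3


# Value used for both bluestore_cache_size_hdd and bluestore_cache_size_ssd above.
cache_size = gib_to_bytes(1)
print(f"bluestore_cache_size_hdd = {cache_size}")
print(f"bluestore_cache_size_ssd = {cache_size}")
```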

With these settings I no longer see OOM errors, but on the node with these settings I have seen multiple Aborted messages in the log files overnight:

grep Abort *log
ceph-osd.10.log:2017-09-09 12:39:28.573034 7f2816f45700 -1 *** Caught signal (Aborted) **
ceph-osd.10.log:     0> 2017-09-09 12:39:28.573034 7f2816f45700 -1 *** Caught signal (Aborted) **
ceph-osd.11.log:2017-09-09 11:39:16.835793 7fdcf6b08700 -1 *** Caught signal (Aborted) **
ceph-osd.11.log:     0> 2017-09-09 11:39:16.835793 7fdcf6b08700 -1 *** Caught signal (Aborted) **
ceph-osd.3.log:2017-09-09 07:10:58.565465 7fa2e96c8700 -1 *** Caught signal (Aborted) **
ceph-osd.3.log:2017-09-09 07:49:56.256899 7f89edf90700 -1 *** Caught signal (Aborted) **
ceph-osd.3.log:     0> 2017-09-09 07:49:56.256899 7f89edf90700 -1 *** Caught signal (Aborted) **
ceph-osd.3.log:2017-09-09 08:13:16.919887 7f82f315e700 -1 *** Caught signal (Aborted) **
ceph-osd.7.log:2017-09-09 09:19:17.281950 7f77824cf700 -1 *** Caught signal (Aborted) **
ceph-osd.7.log:     0> 2017-09-09 09:19:17.281950 7f77824cf700 -1 *** Caught signal (Aborted) **
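
Note that some aborts appear twice in the grep output: once where the signal is logged, and once more with a "0>" prefix from the dump of recent events. A minimal Python sketch for counting distinct aborts per log file, assuming input in the `file:line` form that grep prints above (the function name count_unique_aborts is hypothetical):

```python
import re
from collections import defaultdict

ABORT_RE = re.compile(
    r"^(?:\s*-?\d+>\s+)?"   # optional "     0> " prefix from the recent-events dump
    r"(\S+ \S+) "           # timestamp (date and time)
    r"([0-9a-f]+) "         # thread id
    r"-1 \*\*\* Caught signal \(Aborted\) \*\*"
)


def count_unique_aborts(grep_lines):
    """Count distinct (timestamp, thread id) abort events per log file."""
    seen = defaultdict(set)
    for line in grep_lines:
        # grep prefixes each match with "<file name>:"
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        match = ABORT_RE.match(rest)
        if match:
            seen[name].add((match.group(1), match.group(2)))
    return {name: len(events) for name, events in seen.items()}


sample = [
    "ceph-osd.10.log:2017-09-09 12:39:28.573034 7f2816f45700 -1 *** Caught signal (Aborted) **",
    "ceph-osd.10.log:     0> 2017-09-09 12:39:28.573034 7f2816f45700 -1 *** Caught signal (Aborted) **",
    "ceph-osd.3.log:2017-09-09 07:10:58.565465 7fa2e96c8700 -1 *** Caught signal (Aborted) **",
    "ceph-osd.3.log:2017-09-09 07:49:56.256899 7f89edf90700 -1 *** Caught signal (Aborted) **",
    "ceph-osd.3.log:     0> 2017-09-09 07:49:56.256899 7f89edf90700 -1 *** Caught signal (Aborted) **",
    "ceph-osd.3.log:2017-09-09 08:13:16.919887 7f82f315e700 -1 *** Caught signal (Aborted) **",
]
counts = count_unique_aborts(sample)
print(counts)
```

Applied to the full grep output above, this kind of de-duplication gives one abort each for osd.10, osd.11 and osd.7, and three for osd.3.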

Before I open a ticket, I just want to know if the StupidAllocator is supported in Luminous.

A couple of examples of the Aborts are:

2017-09-09 12:39:27.044074 7f27f5f20700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1504975167035909, "job": 86, "event": "flush_started", "num_memtables": 1, "num_entries": 1015543, "num_deletes": 345553, "memory_usage": 260049176}
2017-09-09 12:39:27.044088 7f27f5f20700  4 rocksdb: [/build/ceph-12.2.0/src/rocksdb/db/flush_job.cc:293] [default] [JOB 86] Level-0 flush table #1825: started
2017-09-09 12:39:28.234651 7f27fff34700 -1 osd.10 pg_epoch: 3521 pg[1.3c7( v 3521'372186 (3456'369135,3521'372186] local-lis/les=3488/3490 n=2842 ec=578/66 lis/c 3488/3488 les/c/f 3490/3500/0 3488/3488/3477) [10,8,16] r=0 lpr=3488 crt=3521'372186 lcod 3521'372184 mlcod 3521'372184 active+clean+snaptrim snaptrimq=[111~2,115~2,13a~1,13c~3]] removing snap head
2017-09-09 12:39:28.573034 7f2816f45700 -1 *** Caught signal (Aborted) **
 in thread 7f2816f45700 thread_name:msgr-worker-2

 ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
 1: (()+0xa562f4) [0x5634e14882f4]
 2: (()+0x11390) [0x7f281b2c5390]
 3: (gsignal()+0x38) [0x7f281a261428]
 4: (abort()+0x16a) [0x7f281a26302a]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x16d) [0x7f281aba384d]
 6: (()+0x8d6b6) [0x7f281aba16b6]
 7: (()+0x8d701) [0x7f281aba1701]
 8: (()+0xb8d38) [0x7f281abccd38]
 9: (()+0x76ba) [0x7f281b2bb6ba]
 10: (clone()+0x6d) [0x7f281a33282d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
-10000> 2017-09-09 12:39:05.878006 7f2817746700  1 -- 172.16.2.133:6804/1327479 <== osd.2 172.16.2.131:6800/1710 37506 ==== osd_repop(mds.0.19:101159707 1.2f1 e3521/3477) v2 ==== 998+0+46 (52256346 0 1629833233) 0x56359eb29000 con 0x563510c02000
 -9999> 2017-09-09 12:39:05.878065 7f2816f45700  1 -- 10.15.2.133:6805/327479 <== mds.0 10.15.2.123:6800/2942775562 55580 ==== osd_op(mds.0.19:101159714 1.ec 1.ffad68ec (undecoded) ondisk+write+known_if_redirected+full_force e3521) v8 ==== 305+0+366 (2883828331 0 2609552142) 0x56355d9eb0c0 con 0x56355f455000


Second example:
2017-09-09 07:10:58.135527 7fa2d56a0700  4 rocksdb: [/build/ceph-12.2.0/src/rocksdb/db/flush_job.cc:264] [default] [JOB 10] Flushing memtable with next log file: 2773

2017-09-09 07:10:58.262058 7fa2d56a0700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1504955458135538, "job": 10, "event": "flush_started", "num_memtables": 1, "num_entries": 935059, "num_deletes": 175946, "memory_usage": 260049888}
2017-09-09 07:10:58.262077 7fa2d56a0700  4 rocksdb: [/build/ceph-12.2.0/src/rocksdb/db/flush_job.cc:293] [default] [JOB 10] Level-0 flush table #2774: started
2017-09-09 07:10:58.565465 7fa2e96c8700 -1 *** Caught signal (Aborted) **
 in thread 7fa2e96c8700 thread_name:bstore_kv_sync

 ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
 1: (()+0xa562f4) [0x5579585362f4]
 2: (()+0x11390) [0x7fa2faa45390]
 3: (gsignal()+0x38) [0x7fa2f99e1428]
 4: (abort()+0x16a) [0x7fa2f99e302a]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x16d) [0x7fa2fa32384d]
 6: (()+0x8d6b6) [0x7fa2fa3216b6]
 7: (()+0x8d701) [0x7fa2fa321701]
 8: (()+0x8d919) [0x7fa2fa321919]
 9: (()+0x1230f) [0x7fa2fb60b30f]
 10: (operator new[](unsigned long)+0x4e7) [0x7fa2fb62f4b7]
 11: (rocksdb::Arena::AllocateNewBlock(unsigned long)+0x70) [0x557958939150]
 12: (rocksdb::Arena::AllocateFallback(unsigned long, bool)+0x45) [0x5579589392d5]
 13: (rocksdb::Arena::AllocateAligned(unsigned long, unsigned long, rocksdb::Logger*)+0x100) [0x557958939460]
 14: (rocksdb::ConcurrentArena::AllocateAligned(unsigned long, unsigned long, rocksdb::Logger*)+0x175) [0x5579588a80d5]
 15: (()+0xe01d03) [0x5579588e1d03]
 16: (()+0xe024dd) [0x5579588e24dd]
 17: (rocksdb::MemTable::Add(unsigned long, rocksdb::ValueType, rocksdb::Slice const&, rocksdb::Slice const&, bool, rocksdb::MemTablePostProcessInfo*)+0x109) [0x5579588a3629]
 18: (rocksdb::MemTableInserter::PutCF(unsigned int, rocksdb::Slice const&, rocksdb::Slice const&)+0x39c) [0x5579588db6cc]
 19: (rocksdb::WriteBatch::Iterate(rocksdb::WriteBatch::Handler*) const+0x5b7) [0x5579588d4fa7]
 20: (rocksdb::WriteBatchInternal::InsertInto(rocksdb::autovector<rocksdb::WriteThread::Writer*, 8ul> const&, unsigned long, rocksdb::ColumnFamilyMemTables*, rocksdb::FlushScheduler*, bool, unsigned long, rocksdb::DB*, bool)+0x14b) [0x5579588d8dcb]
 21: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool)+0x14a7) [0x55795899e2c7]
 22: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x2a) [0x55795899ed2a]
 23: (RocksDBStore::submit_transaction(std::shared_ptr<KeyValueDB::TransactionImpl>)+0xaf) [0x55795847751f]
 24: (BlueStore::_kv_sync_thread()+0x23dc) [0x55795841150c]
 25: (BlueStore::KVSyncThread::entry()+0xd) [0x55795845453d]
 26: (()+0x76ba) [0x7fa2faa3b6ba]
 27: (clone()+0x6d) [0x7fa2f9ab282d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

If the StupidAllocator is supported, I will open up a ticket. 

Eric

